Chapter 1. Electric Charge Interaction

This brief chapter describes the basics of electrostatics, the study of interactions between static (or 
slowly moving) electric charges. Much of this material should be known to the reader from his or her 
undergraduate studies; because of that, the explanations will be very succinct. 1


1.1. The Coulomb law 

A serious discussion of the Coulomb law 2 requires a common agreement on the meaning of the 
following notions: 3 

- electric charges $q_k$, as revealed, most explicitly, by experimental observation of electrostatic interaction between the charged particles;

- electric charge conservation, meaning that the algebraic sum of $q_k$ of all particles inside any closed volume is conserved, unless the charged particles cross the volume’s border; and

- a point charge, meaning the charge of an ultimately small (“point”) particle whose position in space may be completely described (in a given reference frame) by its radius-vector $\mathbf{r} = \mathbf{n}_1 r_1 + \mathbf{n}_2 r_2 + \mathbf{n}_3 r_3$, where $\mathbf{n}_j$ (with j = 1, 2, 3) are unit vectors directed along 3 mutually perpendicular axes, and $r_j$ are the corresponding Cartesian components of $\mathbf{r}$.

I will assume that these notions are well known to the reader - though my strong advice is to give 
some thought to their vital importance. Using them, the Coulomb law for the electrostatic interaction of 
two point charges in otherwise free space may be formulated as follows: 

Coulomb law for 2 point charges

$$\mathbf{F}_{kk'} = \kappa\, q_k q_{k'} \frac{\mathbf{r}_k - \mathbf{r}_{k'}}{|\mathbf{r}_k - \mathbf{r}_{k'}|^3}, \tag{1.1}$$

where $\mathbf{F}_{kk'}$ denotes the force exerted on charge number k by charge number k’. This law is certainly very familiar to the reader, but several remarks may still be due:

(i) Flipping indices k and k’, we see that Eq. (1) 4 complies with the 3rd Newton law: the reciprocal force is equal in magnitude but opposite in direction: $\mathbf{F}_{k'k} = -\mathbf{F}_{kk'}$.

(ii) According to Eq. (1), the magnitude of the force, $F_{kk'}$, is inversely proportional to the square of the distance between the two charges - the well-known undergraduate-level formulation of the Coulomb law.






1 For remedial reading, virtually any undergraduate text on electricity and magnetism may be used; I can recommend either the classical text by I. Tamm, Fundamentals of Theory of Electricity, Mir, 1979, or the more readily available textbook by D. Griffiths, Introduction to Electrodynamics, 3rd ed., Prentice-Hall, 1999.

2 Discovered experimentally in the early 1780s, and formulated in 1785 by C.-A. de Coulomb. 

3 On top of the more general notions of classical Cartesian space, point particles and forces, which are used in classical mechanics - see, e.g., CM Sec. 1.1. (Acronyms CM, SM, and QM refer to the other three parts of my lecture note series. In those parts, this Classical Electrodynamics part is referred to as EM.)

4 As in all other parts of my lecture notes, chapter numbers are omitted in references to equations, figures, and 
sections within the same chapter. 
(iii) Since vector $(\mathbf{r}_k - \mathbf{r}_{k'})$ is directed from point $\mathbf{r}_{k'}$ toward point $\mathbf{r}_k$ (Fig. 1), Eq. (1) implies that charges of the same sign (i.e. with $q_k q_{k'} > 0$) repel, while those with opposite signs ($q_k q_{k'} < 0$) attract each other.



(iv) Constant $\kappa$ in Eq. (1) depends on the system of units we use. In the Gaussian units, $\kappa$ is set to 1, for the price of introducing a special unit of charge (the statcoulomb) that would fit the experimental data for Eq. (1), for forces $F_{kk'}$ measured in Gaussian units (dynes). On the other hand, in the International System (“SI”) of units, the charge unit is one coulomb (abbreviated C), 5 close to $3\times 10^{9}$ statcoulombs, and $\kappa$ is different from unity: 6

Electric force constant

$$\kappa\big|_{\mathrm{SI}} = \frac{1}{4\pi\varepsilon_0} = 10^{-7} c^2. \tag{1.2}$$

Unfortunately, the continuing struggle between zealot proponents of these two systems bears all the ugly features of a religious war, with similarly slim chances for any side to win it in any foreseeable future. In my humble view, each of these systems has its advantages and handicaps (to be noted on several occasions below), and every educated physicist should have no problem with using any of them. Following insistent recommendations of international scientific unions, I will mostly use SI units, but for readers’ convenience, duplicate the most important formulas in the Gaussian units.

Besides Eq. (1), another key experimental law of electrostatics is the linear superposition principle: the electrostatic forces exerted on some point charge (say, $q_k$) by other charges do not affect each other and add up as vectors to form the net force:

$$\mathbf{F}_k = \sum_{k' \neq k} \mathbf{F}_{kk'}, \tag{1.3}$$

where the summation is extended over all charges but $q_k$, and the partial force $\mathbf{F}_{kk'}$ is described by Eq. (1). 7 The fact that the sum is restricted to $k' \neq k$ means that a point charge does not interact with itself.


5 In the formal metrology, one coulomb is defined as the charge carried over by a constant current of one ampere 
(see Ch. 5 for its definition) during one second. 

6 Constant $\varepsilon_0$ is called either the electric constant or the free space permittivity; from Eq. (2) with the free-space speed of light $c \approx 3\times 10^{8}$ m/s, $\varepsilon_0 \approx 8.85\times 10^{-12}$ SI units. For more accurate values of the constants, and their brief discussion, see appendix CA: Selected Physical Constants.

7 Physically this is a very strong statement: it means that Eq. (1) is valid for any pair of charges regardless of the presence of other charges, i.e. not only in free space, but also when placed into an arbitrary medium. The apparent modification of this relation by conductors (Ch. 2) and dielectrics (Ch. 3) is just the result of the appearance of additional electric charges within those media.
This fact may look trivial from Eq. (1), whose right-hand part diverges at $\mathbf{r}_k \to \mathbf{r}_{k'}$, but becomes less evident (though still true) in quantum mechanics where the charge of even an elementary particle is effectively spread around some volume, together with the particle’s wavefunction. 8

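Now we may combine Eqs. (1) and (3) to get the following expression for the net force $\mathbf{F}$ acting on some charge q located at point $\mathbf{r}$:

$$\mathbf{F} = q\,\frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} q_{k'} \frac{\mathbf{r} - \mathbf{r}_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|^3}. \tag{1.4}$$

This equation implies that it makes sense to introduce the notion of the electric field at point $\mathbf{r}$, as an entity independent of the probe charge q, characterized by vector

Electric field: definition

$$\mathbf{E}(\mathbf{r}) \equiv \frac{\mathbf{F}}{q}, \tag{1.5}$$

formally called the electric field strength - but much more frequently, just the “electric field”. In these terms, Eq. (4) becomes

Coulomb law for system of point charges

$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} q_{k'} \frac{\mathbf{r} - \mathbf{r}_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|^3}. \tag{1.6}$$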


This concept is so appealing that Eq. (5) is used well beyond the boundaries of free-space electrostatics. 
Moreover, the notion of field becomes virtually unavoidable for description of time-dependent 
phenomena (such as electromagnetic waves), where the electromagnetic field shows up as a specific 
form of matter, with zero rest mass, and hence different from the usual “material” particles. 
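The superposition sum of Eq. (6) translates directly into a short loop. The following is a minimal sketch, not part of the original notes: the function name `electric_field` and the tuple-based data layout are illustrative assumptions, and the constant is taken from Eq. (2) as $10^{-7}c^2$.

```python
import math

# Eq. (2): kappa = 1/(4*pi*eps0) = 1e-7 * c^2 in SI units.
SPEED_OF_LIGHT = 299_792_458.0  # m/s
KAPPA_SI = 1e-7 * SPEED_OF_LIGHT**2


def electric_field(r, charges, kappa=KAPPA_SI):
    """Return E(r) as an (Ex, Ey, Ez) tuple, following Eq. (6).

    `charges` is a list of (q, (x, y, z)) pairs (a hypothetical layout).
    A charge located exactly at the observation point is skipped,
    mirroring the restriction r_k' != r in the sum.
    """
    field = [0.0, 0.0, 0.0]
    for q, position in charges:
        d = [r[i] - position[i] for i in range(3)]  # vector r - r_k'
        dist = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
        if dist == 0.0:
            continue
        factor = kappa * q / dist**3
        for i in range(3):
            field[i] += factor * d[i]
    return tuple(field)
```

For instance, two equal charges placed symmetrically about the origin give a vanishing field at the origin, because their contributions to the sum cancel.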


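Many problems involve many point charges $q_{k'}, q_{k''}, \ldots$, located so closely that it is possible to approximate them with a continuous charge distribution. Indeed, for a group of charges within a very small volume $d^3r'$, with the linear size satisfying the strong condition $dr' \ll |\mathbf{r} - \mathbf{r}_{k'}|$, the geometrical factor in Eq. (6) is essentially the same. As a result, all these charges may be treated as a single charge $dQ(\mathbf{r}')$. Since this charge is proportional to $d^3r'$, we can define the local (3D) charge density $\rho(\mathbf{r}')$ by relation 9

$$\rho(\mathbf{r}')\,d^3r' \equiv dQ(\mathbf{r}') \equiv \sum_{\mathbf{r}_{k'} \in d^3r'} q_{k'}, \tag{1.7}$$

and rewrite Eq. (6) as

$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_{d^3r'} dQ(\mathbf{r}') \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3} = \frac{1}{4\pi\varepsilon_0} \sum_{d^3r'} \rho(\mathbf{r}')\,d^3r'\, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}, \tag{1.8}$$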


8 Moreover, there are some widely used approximations, e.g., the Kohn-Sham equations in the density functional 
theory of multiparticle systems, which essentially violate this law, thus limiting the accuracy and applicability of 
these approximations - see, e.g., QM Sec. 8.4. 

9 The 2D (areal) charge density $\sigma$ and 1D (linear) density $\lambda$ may be defined absolutely similarly: $dQ = \sigma\,d^2r$, $dQ = \lambda\,dr$. Note that a finite value of $\sigma$ and $\lambda$ means that the volume density $\rho$ is infinite in the charge location points; for example for a plane z = 0, charged with a constant areal density $\sigma$, $\rho = \sigma\delta(z)$.
i.e. as the integral (over the whole volume containing all essential charges): 10

Coulomb law for continuous charge distribution

$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \rho(\mathbf{r}') \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\, d^3r'. \tag{1.9}$$


It is very convenient that Eq. (9) may be used even in the case of discrete point charges, employing the notion of Dirac’s delta-function, 11 which is a mathematical approximation for a very sharp function equal to zero everywhere but one point, and still having a finite (unit) integral. Indeed, in this formalism, a set of point charges $q_{k'}$ located at points $\mathbf{r}_{k'}$ may be presented by the pseudo-continuous distribution with density

$$\rho(\mathbf{r}') = \sum_{k'} q_{k'}\, \delta(\mathbf{r}' - \mathbf{r}_{k'}). \tag{1.10}$$

Plugging this expression into Eq. (9), we come back to the discrete version (6) of the Coulomb law.


1.2. The Gauss law 

Due to the extension to point (“discrete”) charges, it may seem that Eqs. (5) and (9) are all we need for solving any problem of electrostatics. In practice, this is not quite true, first of all because the direct use of Eq. (9) frequently leads to complex calculations. Indeed, let us consider a very simple example: the electric field produced by a spherically-symmetric charge distribution with density $\rho(r')$. We may immediately use the problem symmetry to argue that the electric field should be also spherically-symmetric, with only one component in spherical coordinates: $\mathbf{E}(\mathbf{r}) = E(r)\,\mathbf{n}_r$, where $\mathbf{n}_r \equiv \mathbf{r}/r$ is the unit vector in the direction of the field observation point $\mathbf{r}$ (Fig. 2).



Fig. 1.2. One of the simplest problems of 
electrostatics: electric field produced by a 
spherically-symmetric charge distribution. 


Taking this direction as the polar axis of a spherical coordinate system, we can use the evident independence of the elementary radial field $dE$, created by the elementary charge $\rho(r')\,d^3r' = \rho(r')\,r'^2 \sin\theta'\, dr'\, d\theta'\, d\varphi'$, of the azimuthal angle $\varphi'$, and reduce integral (9) to

$$E = \frac{1}{4\pi\varepsilon_0}\, 2\pi \int_0^{\pi} \sin\theta'\, d\theta' \int_0^{\infty} r'^2\, dr'\, \frac{\rho(r')}{(r'')^2} \cos\theta, \tag{1.11}$$


10 Note that for a continuous, smooth charge distribution, integral (9) does not diverge at $R \equiv |\mathbf{r} - \mathbf{r}'| \to 0$, because in this limit the fraction under the integral increases as $R^{-2}$, i.e. slower than the decrease of the elementary volume $d^3r'$, proportional to $R^3$.

11 See, e.g., Sec. 14 of the Selected Mathematical Formulas appendix, referred to below as MA.
where $\theta$ and $r''$ are the geometrical parameters marked in Fig. 2. Since they all may be readily expressed via $r'$ and $\theta'$ using auxiliary parameters a and h,

$$\cos\theta = \frac{r - a}{r''}, \quad (r'')^2 = h^2 + (r - r'\cos\theta')^2, \quad a \equiv r'\cos\theta', \quad h \equiv r'\sin\theta', \tag{1.12}$$

integral (11) may be eventually reduced to an explicit integral over $r'$ and $\theta'$, and worked out analytically, but that would require some effort.
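To make the structure of integral (11) with the geometry relations (12) more tangible, it may be evaluated by brute force. The sketch below is an illustration only, under stated assumptions: it uses a plain midpoint rule, works in units where $1/4\pi\varepsilon_0 = 1$, truncates the radial integral at a finite `r_max` beyond which the density is assumed to vanish, and the function name `radial_field_direct` is hypothetical.

```python
import math


def radial_field_direct(r, rho, r_max, n_r=200, n_theta=200):
    """Midpoint-rule estimate of Eq. (11), in units with 1/(4*pi*eps0) = 1.

    r      : distance of the observation point from the center
    rho    : callable giving the charge density as a function of r'
    r_max  : radius beyond which rho is assumed to vanish
    """
    dr = r_max / n_r
    dtheta = math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        rp = (i + 0.5) * dr                    # r'
        for j in range(n_theta):
            th = (j + 0.5) * dtheta            # theta'
            a = rp * math.cos(th)              # Eq. (12): a = r' cos(theta')
            h = rp * math.sin(th)              # Eq. (12): h = r' sin(theta')
            dist2 = h * h + (r - a) ** 2       # (r'')^2
            cos_theta = (r - a) / math.sqrt(dist2)
            total += math.sin(th) * rp * rp * rho(rp) * cos_theta / dist2
    return 2.0 * math.pi * total * dr * dtheta
```

For a uniformly charged ball of unit radius and unit total charge, calling this function at r = 2 should return a value close to $1/r^2 = 0.25$ in these units, the field of a point charge placed at the center.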

For more complex problems, integral (9) may be much more complex, defying an analytical solution. One could argue that with the present-day abundance of computers and numerical algorithm libraries, one can always resort to numerical integration. This argument may be enhanced by the fact that numerical integration is based on the replacement of the integral by a sum, and summation is much more robust to (unavoidable) discretization and rounding errors than the finite-difference schemes typical for the numerical solution of differential equations.

These arguments, however, are only partly justified, since in many cases the numerical approach 
runs into a problem sometimes called the curse of dimensionality, in which the last word refers to the 
number of input parameters of the problem to be solved, i.e. the dimensionality of its parameter space. 
Let us discuss this issue, because it is common for most fields of physics and, more generally, any 
quantitative science. 12 

If the number of the parameters of a problem is small, the results of its numerical solution may be of the same (and in some sense higher) value than the analytical ones. For example, if a problem has no parameters, and its result is just one number (say, $\pi^2/4$), this “analytical” answer hardly carries more information than its numerical form 2.4674011... Now, if a problem has one input parameter (say, a), the result of an analytical approach in most cases may be presented as an analytical function f(a). If the function is really simple, called elementary, with many properties well known (say, f(a) = sin a), this function gives us virtually everything we want to know. However, if the function is complicated, you would need to tabulate it numerically for a set of values of parameter a and possibly present the result as a plot. The same results (and the same plot) can be calculated numerically, without using analytics at all. This plot may certainly be very valuable, but since the analytical form has a potential of giving you more information (say, the values of f(a) outside the plot range, or the asymptotic behavior of the function), it is hard to say that the numerics completely beat the analytics here.

Now let us assume that you have more input parameters. For two parameters (say, a and b), 
instead of one curve you would need a family of such curves for several (sometimes many) values of b. 
Still, the plots sometimes may fit one page convenient for viewing, so it is still not too bad. Now, if you 
have three parameters, the full representation of the results may require many pages (maybe a book) full 
of curves, for four parameters we may speak about several bookshelves, for five parameters something 
like a library, etc. For a large number of parameters, typical for many scientific problems, the number of points in the parameter space grows exponentially, so that even the volume of calculations necessary for the generation of this data may become impracticable, despite the dirt-cheap CPU time we have now.
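The exponential growth mentioned above is easy to quantify. The following tiny sketch assumes, purely for illustration, that each parameter is sampled at the same number of values on a regular grid; the function name `grid_points` and the default of 100 values per parameter are not from the text.

```python
def grid_points(n_params, values_per_param=100):
    """Number of points on a regular grid in an n_params-dimensional parameter space."""
    return values_per_param ** n_params
```

With 100 values per parameter, one parameter requires 100 calculations, while five parameters already require $10^{10}$ of them.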

Thus, despite the current proliferation of numerical methods in physics, analytical results have an ever-lasting value, and we should try to get them whenever we can. For our current problem of finding the electric field generated by a fixed set of electric charges, great help comes from the Gauss law.


12 Actually, the term “curse of dimensionality” was coined in the 1950s by R. Bellman in the context of the 
optimal control theory, and only later spread to other sciences that heavily rely on numerical calculations. 
Let us consider a single point charge q inside a smooth, closed surface S (Fig. 3), and calculate product $E_n d^2r$, where $d^2r$ is an infinitesimal element of the surface (which may be well approximated with a plane of that area), and $E_n$ is the component of the electric field in that point, normal to that plane.



This component may be calculated as $E\cos\theta$, where $\theta$ is the angle between vector $\mathbf{E}$ and the unit vector $\mathbf{n}$ normal to the surface. (Equivalently, $E_n$ may be presented as the scalar product $\mathbf{E}\cdot\mathbf{n}$.) Now let us notice that the product $\cos\theta\, d^2r$ is nothing more than the area $d^2r'$ of the projection of $d^2r$ onto the plane perpendicular to vector $\mathbf{r}$ connecting charge q with this point of the surface (Fig. 3), because the angle between the planes $d^2r'$ and $d^2r$ is also equal to $\theta$. Using the Coulomb law for $\mathbf{E}$, we get

$$E_n d^2r = E\cos\theta\, d^2r = \frac{1}{4\pi\varepsilon_0} \frac{q}{r^2}\, d^2r'. \tag{1.13}$$

But the ratio $d^2r'/r^2$ is nothing more than the elementary solid angle $d\Omega$ under which the areas $d^2r'$ and $d^2r$ are seen from the charge point, so that $E_n d^2r$ may be presented as just a product of $d\Omega$ by a constant ($q/4\pi\varepsilon_0$). Summing these products over the whole surface, we get

$$\oint_S E_n d^2r = \frac{q}{4\pi\varepsilon_0} \oint_S d\Omega = \frac{q}{\varepsilon_0}, \tag{1.14}$$

since the full solid angle equals $4\pi$. (The integral in the left-hand part of this relation is called the flux of electric field through surface S.)

Equation (14) expresses the Gauss law for one point charge. However, it is only valid if the charge is located inside the volume limited by the surface. In order to find the flux created by a charge outside of the volume, we still can use Eq. (13), but to proceed we have to be careful with the signs of the elementary contributions $E_n d^2r$. Let us use the common convention to direct the unit vector $\mathbf{n}$ out of the closed volume we are considering (the so-called outer normal), so that the elementary product $E_n d^2r = (\mathbf{E}\cdot\mathbf{n})\,d^2r$, and hence the corresponding contribution $d\Omega$, is positive if vector $\mathbf{E}$ is pointing out of the volume (like in the example shown in Fig. 3a and the upper-right area in Fig. 3b), and negative in the opposite case (for example, in the lower-left area in Fig. 3b). As the latter figure shows, if the charge is located outside of the volume, for each positive contribution $d\Omega$ there is always an equal and opposite contribution to the integral. As a result, at the integration over the solid angle the positive and negative contributions cancel exactly, so that

$$\oint_S E_n d^2r = 0. \tag{1.15}$$

The real power of the Gauss law is revealed by its generalization to the case of many charges within volume V. Since the calculation of flux is a linear operation, the linear superposition principle (3) means that the flux created by several charges is equal to the (algebraic) sum of individual fluxes from each charge, for which either Eq. (14) or Eq. (15) is valid, depending on the charge position (in or out of the volume). As a result, for the total flux we get:

Gauss law

$$\oint_S E_n d^2r = \frac{1}{\varepsilon_0} \int_V \rho(\mathbf{r}')\, d^3r' \equiv \frac{Q_V}{\varepsilon_0}, \tag{1.16}$$

where $Q_V$ is the net charge inside volume V. This is the full version of the Gauss law.
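The statements of Eqs. (14) and (15) may be checked numerically for a surface with no special symmetry relative to the charge. The sketch below is a hedged illustration rather than a method from the text: it takes the surface to be a cube centered at the origin, integrates $E_n$ over its six faces with a midpoint rule, and uses units where $\varepsilon_0 = 1$. The function name `flux_through_cube` is hypothetical.

```python
import math


def flux_through_cube(q, charge_pos, half=1.0, n=100):
    """Flux of a point charge's field through the cube |x|, |y|, |z| <= half.

    Units with eps0 = 1, so that E = q * d / (4 * pi * |d|^3), where d is
    the vector from the charge to the surface point. The charge must not
    sit on the surface itself.
    """
    step = 2.0 * half / n
    total = 0.0
    for axis in range(3):                # direction of the face normal
        for sign in (-1.0, 1.0):         # outer normal is sign * unit vector
            for i in range(n):
                u = -half + (i + 0.5) * step
                for j in range(n):
                    v = -half + (j + 0.5) * step
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half
                    point[(axis + 1) % 3] = u
                    point[(axis + 2) % 3] = v
                    d = [point[k] - charge_pos[k] for k in range(3)]
                    dist3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    e_n = q * sign * d[axis] / (4.0 * math.pi * dist3)
                    total += e_n * step * step
    return total
```

With a charge q placed anywhere inside the cube, the result should be close to q (i.e. $q/\varepsilon_0$ in these units); with the charge outside, close to zero.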


In order to appreciate the problem-solving power of the law, let us return to the problem presented in Fig. 2, i.e. a spherical charge distribution. Due to its symmetry, which had already been discussed above, if we apply Eq. (16) to a sphere of radius r, the electric field should be perpendicular to the sphere at each of its points (i.e., $E_n = E$), and its magnitude the same at all points: $E_n = E = E(r)$. As a result, the flux calculation is elementary:

$$\oint E_n d^2r = 4\pi r^2 E(r). \tag{1.17}$$

Now, applying the Gauss law (16), we get:

$$4\pi r^2 E(r) = \frac{1}{\varepsilon_0} \int_{r'<r} \rho(r')\, d^3r' = \frac{4\pi}{\varepsilon_0} \int_0^r r'^2 \rho(r')\, dr', \tag{1.18}$$

so that, finally,

$$E(r) = \frac{1}{\varepsilon_0 r^2} \int_0^r r'^2 \rho(r')\, dr' = \frac{1}{4\pi\varepsilon_0} \frac{Q(r)}{r^2}, \tag{1.19}$$

where Q(r) is the full charge inside the sphere of radius r:

$$Q(r) \equiv 4\pi \int_0^r \rho(r')\, r'^2\, dr'. \tag{1.20}$$


In particular, this formula shows that the field outside of a sphere of a finite radius R is exactly the same as if all its charge Q = Q(R) was concentrated in the sphere’s center. (Note that this important result is valid only for a spherically-symmetric charge distribution.) For the field inside the sphere, finding the electric field still requires an explicit integration (20), but this 1D integral is much simpler than the 2D integral (11), and in some important cases may be readily worked out analytically. For example, if charge Q is uniformly distributed inside a sphere of radius R,

$$\rho(r') = \rho = \frac{Q}{V} = \frac{Q}{(4\pi/3)R^3}, \tag{1.21}$$

the integration is elementary:

$$E(r) = \frac{\rho}{\varepsilon_0 r^2} \int_0^r r'^2\, dr' = \frac{\rho r}{3\varepsilon_0} = \frac{1}{4\pi\varepsilon_0} \frac{Qr}{R^3}. \tag{1.22}$$


We see that in this case the field is growing linearly from the center to the sphere’s surface, and only at r > R starts to decrease in agreement with Eq. (19) with constant Q(r) = Q. Another important observation is that the results for $r \leq R$ and $r \geq R$ give the same value ($Q/4\pi\varepsilon_0 R^2$) at the charged sphere’s surface, r = R, so that the electric field is continuous.
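Equations (19) and (20) reduce the field calculation to a single 1D integral, which is simple to evaluate for any spherically-symmetric density. The sketch below is an illustration under stated assumptions: it uses a midpoint rule, works in units where $1/4\pi\varepsilon_0 = 1$, and the names `enclosed_charge` and `gauss_field` are hypothetical.

```python
import math


def enclosed_charge(r, rho, n=2000):
    """Midpoint-rule estimate of Q(r) from Eq. (20)."""
    dr = r / n
    total = 0.0
    for i in range(n):
        rp = (i + 0.5) * dr
        total += rho(rp) * rp * rp
    return 4.0 * math.pi * total * dr


def gauss_field(r, rho, n=2000):
    """E(r) from Eq. (19), in units with 1/(4*pi*eps0) = 1."""
    return enclosed_charge(r, rho, n) / r**2


def uniform_ball(q_total, radius):
    """Density of Eq. (21): constant inside the ball, zero outside."""
    rho0 = q_total / (4.0 * math.pi / 3.0 * radius**3)
    return lambda rp: rho0 if rp <= radius else 0.0
```

For `uniform_ball(1.0, 1.0)`, the computed field should follow the linear law (22) inside the ball and the $1/r^2$ law outside, with both branches giving the same value at r = R, so that the field is continuous there.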

In order to underline the importance of the last fact, let us consider one more elementary but very important example of the Gauss law’s application. Let a thin plane sheet (Fig. 4) be charged uniformly, with an areal density $\sigma$ = const (see Footnote 9 above).


Fig. 1.4. Electric field of a charged plane.


In this case, it is fruitful to use the Gauss volume in the form of a planar “pillbox” of thickness 2z (where z is the Cartesian coordinate perpendicular to the charged plane) and certain area A - see Fig. 4. Due to the symmetry of the problem, it is evident that the electric field should be: (i) directed along axis z, (ii) constant on each of the upper and bottom sides of the pillbox, (iii) equal and opposite on these sides, and (iv) parallel to the side surfaces of the box. As a result, the full electric field flux through the pillbox surface is just 2AE(z), so that the Gauss law (16) yields

$$2AE(z) = \frac{1}{\varepsilon_0} Q_A = \frac{1}{\varepsilon_0} \sigma A, \tag{1.23}$$

and we get a very simple but important formula

$$E(z) = \frac{\sigma}{2\varepsilon_0} = \text{const}. \tag{1.24}$$

Notice that, somewhat counter-intuitively, the field magnitude does not depend on the distance from the charged plane. From the point of view of the Coulomb law (5), this result may be explained as follows: the farther the observation point from the plane, the weaker the effect of each elementary charge, $dQ = \sigma\, d^2r$, but the more such elementary charges give contributions to the vertical component of vector $\mathbf{E}$.
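This distance independence may be verified directly from the Coulomb law by summing the contributions of the elementary charges. The sketch below is an illustration under stated assumptions, not taken from the text: the infinite plane is replaced by a disk whose radius is much larger than z, the disk is split into thin rings (by symmetry, only the axial field component of each ring survives), and units with $\varepsilon_0 = 1$ are used. The function name `disk_axis_field` is hypothetical.

```python
import math


def disk_axis_field(z, sigma, radius, n=100_000):
    """Axial field at height z above the center of a uniformly charged disk.

    Midpoint-rule sum over rings of radius s and width ds; units with
    eps0 = 1. Each ring carries the charge sigma * 2*pi*s*ds at distance
    sqrt(s^2 + z^2), and contributes only through the cosine factor
    z / sqrt(s^2 + z^2).
    """
    ds = radius / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        ring_charge = sigma * 2.0 * math.pi * s * ds
        total += ring_charge * z / (4.0 * math.pi * (s * s + z * z) ** 1.5)
    return total
```

For a disk of radius 1000 and unit areal density, the values at z = 1 and z = 5 should both be close to $\sigma/2\varepsilon_0 = 0.5$, with small deviations caused only by the finite disk size.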

Note also that though the magnitude $E \equiv |\mathbf{E}|$ of the electric field is constant, its vertical component $E_z$ changes sign at z = 0 (Fig. 4), experiencing a discontinuity (jump) equal to $\Delta E_z = \sigma/\varepsilon_0$. This jump disappears if the surface is not charged ($\sigma = 0$). This statement remains true in a more general case of finite volume (but not surface!) charge density $\rho$. Returning for a minute to our charged sphere problem, very close to its surface it may be considered planar, so that the electric field should indeed be continuous, as it is.

Admittedly, the integral form (16) of the Gauss law is immediately useful only for highly symmetrical geometries, as in the two problems discussed above. However, it may be recast into an alternative, differential form whose field of useful applications is much wider. This form may be obtained from Eq. (16) using the divergence theorem that, according to the vector algebra, is valid for any space-differentiable vector, in particular $\mathbf{E}$, and for any volume V limited by closed surface S: 13

$$\oint_S E_n d^2r = \int_V (\nabla\cdot\mathbf{E})\, d^3r, \tag{1.25}$$

where $\nabla$ is the del (or “nabla”) operator of spatial differentiation. 14 Combining Eq. (25) with the Gauss law (16), we get

$$\int_V \left( \nabla\cdot\mathbf{E} - \frac{\rho}{\varepsilon_0} \right) d^3r = 0. \tag{1.26}$$


For a given distribution of electric charge (and hence of the electric field), this equation should be valid for any choice of volume V. This can hold only if the function under the integral vanishes at each point, i.e. if 15

Inhomogeneous Maxwell equation for E

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}. \tag{1.27}$$

Note that in a sharp contrast with the integral form (16), Eq. (27) is local: it relates the electric field divergence to the charge density at the same point. This equation, being the differential form of the Gauss law, is frequently called (the free-space version of) one of Maxwell equations. Another, homogeneous Maxwell equation’s “embryo” may be obtained by noticing that the curl of a point charge’s field, and hence that of any system of charges, equals zero: 16

Homogeneous Maxwell equation for E

$$\nabla\times\mathbf{E} = 0. \tag{1.28}$$


(We will arrive at two other Maxwell equations, for the magnetic field, in Chapter 5, and then generalize 
all the equations to their full, time-dependent form by the end of Chapter 6. However, Eq. (27) would 
stay the same.) 


Just to get a better gut feeling of Eq. (27), let us apply it to the same example of a uniformly charged sphere (Fig. 2). The vector algebra tells us that the divergence of a spherically symmetric vector function $\mathbf{E}(\mathbf{r}) = E(r)\,\mathbf{n}_r$ may be simply expressed in spherical coordinates: 17


13 See, e.g., MA Eq. (12.2). Note that the scalar product under the integral in Eq. (25) is nothing more than the divergence of vector $\mathbf{E}$ - see, e.g., MA Eq. (8.4).

14 See, e.g., MA Secs. 8-10.

15 In the Gaussian units, just as in the initial Eq. (5), $\varepsilon_0$ has to be replaced with $1/4\pi$, so that the Maxwell equation (27) looks like $\nabla\cdot\mathbf{E} = 4\pi\rho$, while Eq. (28) stays the same.

16 This follows, for example, from the direct application of MA Eq. (10.11) to the spherically-symmetric vector function $\mathbf{f} = \mathbf{E}(\mathbf{r}) = E(r)\,\mathbf{n}_r$, the field of a point charge placed at the origin, giving $f_\theta = f_\varphi = 0$ and $\partial f_r/\partial\theta = \partial f_r/\partial\varphi = 0$.

17 See, e.g., MA Eq. (10.10) for this particular case (when $\partial/\partial\theta = \partial/\partial\varphi = 0$).
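$$\nabla\cdot\mathbf{E} = \frac{1}{r^2} \frac{d}{dr}\left(r^2 E\right). \tag{1.29}$$

As a result, Eq. (27) yields a linear, ordinary differential equation for the function E(r):

$$\frac{1}{r^2} \frac{d}{dr}\left(r^2 E\right) = \begin{cases} \rho/\varepsilon_0, & \text{for } r \leq R, \\ 0, & \text{for } r \geq R, \end{cases} \tag{1.30a}$$

that may be readily integrated on each of the segments:

$$E(r) = \frac{1}{\varepsilon_0} \frac{1}{r^2} \times \begin{cases} \rho \int r^2\, dr = \rho r^3/3 + c_1, & \text{for } r \leq R, \\ c_2, & \text{for } r \geq R. \end{cases} \tag{1.30b}$$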


In order to determine the integration constant $c_1$, we can use boundary condition E(0) = 0. (It follows from the problem’s spherical symmetry: in the center of the sphere, the electric field has to vanish, because otherwise, where would it be directed?) Constant $c_2$ may be found from the continuity condition E(R - 0) = E(R + 0), which has already been discussed above. As a result, we arrive at our previous results (19) and (22).

We can see that in this particular, highly symmetric case, using the differential form of the Gauss law is more complex than its integral form. (For our second example, shown in Fig. 4, it would be even less natural.) However, Eq. (27) and its generalizations are more convenient for asymmetric charge distributions, and invaluable in the cases where the charge distribution $\rho(\mathbf{r})$ is not known a priori and has to be found in a self-consistent way. (We will start discussing such cases in the next chapter.)


1.3. Scalar potential and electric field energy 


One more help for solving electrostatics (and more complex) problems may be obtained from the notion of the electrostatic potential, which is just the electrostatic potential energy U of a probe particle, normalized by its charge:

Electrostatic potential

$$\phi \equiv \frac{U}{q}. \tag{1.31}$$


As we know from classical mechanics, 18 the notion of U (and hence $\phi$) makes sense only for the case of potential forces, for example those depending just on the particle’s position. Equations (6) and (8) show that, in the static situations, the electric field clearly falls into this category. For such a field, the potential energy may be defined as a scalar function $U(\mathbf{r})$ that allows the force to be calculated as its gradient (with the opposite sign):

$$\mathbf{F} = -\nabla U. \tag{1.32}$$

Dividing both sides of this equation by the charge of the probe particle, and using Eqs. (5) and (31), we get 19


18 See, e.g., CM Sec. 1.4. 

19 Eq. (28) could be also derived from this relation, because according to vector algebra, any gradient field has 
vanishing curl - see, e.g., MA Eq. (11.1). 
$$\mathbf{E} = -\nabla\phi. \tag{1.33}$$

In order to calculate the scalar potential, let us start from the simplest case of a single point charge q placed at the origin. For it, the Coulomb law (5) takes a simple form

$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0}\, q\, \frac{\mathbf{r}}{r^3} = \frac{1}{4\pi\varepsilon_0}\, q\, \frac{\mathbf{n}_r}{r^2}. \tag{1.34}$$

It is straightforward to check that the last fraction in the right-hand part of this equation is equal to $-\nabla(1/r)$. 20 Hence, according to the definition (33), for this particular case

Potential of a point charge

$$\phi = \frac{1}{4\pi\varepsilon_0} \frac{q}{r}. \tag{1.35}$$

(In the Gaussian units, this result is spectacularly simple: $\phi = q/r$.) Note that we could add an arbitrary constant to this potential (and indeed to any other distribution of $\phi$ discussed below) without changing the force, but it is convenient to define the potential energy to approach zero at infinity.

Before going any further, let us demonstrate how useful the notions of U and $\phi$ are, on a very simple example. Let two similar charges q be launched from afar, with an initial velocity $v_0 \ll c$ each, straight toward each other (i.e. with the zero impact parameter) - see Fig. 5. Since, according to the Coulomb law, the charges repel each other with increasing force, they will stop at some minimum distance $r_{\min}$ from each other, and then fly back.

Fig. 1.5. Simple problem of electric particle motion.


We could of course find $r_{\min}$ directly from the Coulomb law. However, for that we would need to write the 2nd Newton law for each particle (actually, due to the problem symmetry, they would be similar), then integrate them over time once to find the particle velocity v as a function of distance, and then recover $r_{\min}$ from the requirement v = 0. The notion of potential allows this problem to be solved in one line. Indeed, in the field of potential forces the system’s total energy $\mathcal{E} = T + U = T + q\phi$ is conserved. In our nonrelativistic case, the kinetic energy T is just $mv^2/2$. Hence, equating the total energy of two particles in the points $r = \infty$ and $r = r_{\min}$, and using Eq. (35) for $\phi$, we get

$$2\,\frac{mv_0^2}{2} + 0 = 0 + \frac{1}{4\pi\varepsilon_0} \frac{q^2}{r_{\min}}, \tag{1.36}$$

immediately giving us the final answer: $r_{\min} = q^2/4\pi\varepsilon_0 m v_0^2$.
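As a quick numerical illustration of this result, the sketch below evaluates $r_{\min}$ and re-checks the energy balance (36). The choice of two protons and of the launch velocity is an assumption made purely for illustration, as is the function name `closest_approach`; the constant $1/4\pi\varepsilon_0$ is taken from Eq. (2) as $10^{-7}c^2$.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0          # m/s
KAPPA_SI = 1e-7 * SPEED_OF_LIGHT**2     # 1/(4*pi*eps0), per Eq. (2)

# Illustrative particle data (protons); not part of the original problem.
PROTON_CHARGE = 1.602176634e-19         # C
PROTON_MASS = 1.67262192369e-27         # kg


def closest_approach(q, m, v0, kappa=KAPPA_SI):
    """Minimum distance r_min = q^2 / (4*pi*eps0 * m * v0^2) following from Eq. (36)."""
    return kappa * q**2 / (m * v0**2)


v0 = 1.0e5  # m/s, much less than c, so the nonrelativistic formula applies
r_min = closest_approach(PROTON_CHARGE, PROTON_MASS, v0)

# Energy balance of Eq. (36): initial kinetic energy of both particles
# equals the potential energy at the turning point.
kinetic = 2.0 * (PROTON_MASS * v0**2 / 2.0)
potential = KAPPA_SI * PROTON_CHARGE**2 / r_min
```

Note that $r_{\min}$ scales as $1/v_0^2$: doubling the launch velocity reduces the closest-approach distance by a factor of four.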


Now let us calculate $\phi$ for an arbitrary configuration of charges. For a single charge in an arbitrary position (say, $\mathbf{r}_{k'}$), r in Eq. (35) should be evidently replaced with $|\mathbf{r} - \mathbf{r}_{k'}|$. Now, the linear superposition principle (3) allows for an easy generalization of this formula to the case of an arbitrary set of discrete charges,

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_{\mathbf{r}_{k'} \neq \mathbf{r}} \frac{q_{k'}}{|\mathbf{r} - \mathbf{r}_{k'}|}. \tag{1.37}$$

20 This may be done either by Cartesian components or using the well-known expression $\nabla f = (df/dr)\,\mathbf{n}_r$ valid for any spherically-symmetric scalar function f(r) - see, e.g., MA Eq. (10.8) for the particular case $\partial/\partial\theta = \partial/\partial\varphi = 0$.


Finally, using the same arguments as in Sec. 1, we can use this result to argue that in the case of an arbitrary continuous charge distribution

Potential of a charge distribution

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r'. \tag{1.38}$$

Again, the notion of Dirac’s delta-function allows using the last equation for discrete charges as well, so that Eq. (38) may be considered as the general expression for the electrostatic potential.


For most practical calculations, using this expression and then applying Eq. (33) to the result is preferable to using Eq. (9), because $\phi$ is a scalar, while $\mathbf{E}$ is a 3D vector - mathematically equivalent to 3 scalars. Still, this approach may lead to technical problems similar to those discussed in Sec. 2. For example, applying it to the spherically-symmetric distribution of charge (Fig. 2), we get integral

$$\phi = \frac{1}{4\pi\varepsilon_0}\, 2\pi \int_0^{\pi} \sin\theta'\, d\theta' \int_0^{\infty} r'^2\, dr'\, \frac{\rho(r')}{r''}, \tag{1.39}$$

which is not much simpler than Eq. (11).

The situation may be much improved by re-casting Eq. (38) into a differential form. For that, it is sufficient to plug the definition of $\phi$, Eq. (33), into Eq. (27):

$$\nabla\cdot(-\nabla\phi) = \frac{\rho}{\varepsilon_0}. \tag{1.40}$$

The left-hand part of this equation is nothing more than the Laplace operator of $\phi$ (with the minus sign), so that we get the famous Poisson equation 21 for the electrostatic potential:

Poisson equation for $\phi$

$$\nabla^2\phi = -\frac{\rho}{\varepsilon_0}. \tag{1.41}$$

(In the Gaussian units, the Poisson equation looks like $\nabla^2\phi = -4\pi\rho$.) This differential equation is so convenient for applications that even its particular case for $\rho = 0$,

$$\nabla^2\phi = 0, \tag{1.42}$$

has earned a special name - the Laplace equation. 22

In order to get a feeling for the Poisson equation as a problem-solving tool, let us return to the
spherically-symmetric charge distribution (Fig. 2) with a constant charge density $\rho$. Using the


21 Named after S. D. Poisson (1781-1840), also famous for the Poisson distribution - one of the central results of 
the probability theory - see, e.g., SM Sec. 5.2. 

22 After mathematician (and astronomer) P. S. de Laplace (1749-1827) who, together with A. Clairault, is credited 
for the development of the very concept of potential. 


symmetry, we can present the potential as $\phi(\mathbf{r}) = \phi(r)$, and hence use the following simple expression for
its Laplace operator: 23

$$\nabla^2\phi = \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\phi}{dr}\right), \qquad (1.43)$$

so that for the points inside the charged sphere (r < R) the Poisson equation yields

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\phi}{dr}\right) = -\frac{\rho}{\varepsilon_0}, \quad \text{i.e.} \quad \frac{d}{dr}\left(r^2\frac{d\phi}{dr}\right) = -\frac{\rho}{\varepsilon_0}\, r^2. \qquad (1.44)$$


Integrating the last form of the equation over r once, with the natural boundary condition $d\phi/dr|_{r=0} = 0$
(because of the condition E(0) = 0, which has been discussed above), we get

$$\frac{d\phi}{dr}(r) = -\frac{\rho}{\varepsilon_0}\frac{1}{r^2}\int_0^r r'^2\, dr' = -\frac{\rho r}{3\varepsilon_0} = -\frac{1}{4\pi\varepsilon_0}\frac{Qr}{R^3}. \qquad (1.45)$$


Since this derivative is nothing more than -E(r), in this formula we can readily recognize our previous
result (22). Now we may like to carry out the second integration to calculate the potential itself:

$$\phi(r) = -\frac{Q}{4\pi\varepsilon_0 R^3}\int_0^r r'\, dr' + c_1 = -\frac{Qr^2}{8\pi\varepsilon_0 R^3} + c_1. \qquad (1.46)$$


Before making any judgment on the integration constant $c_1$, let us solve the Poisson equation (in
this case, just the Laplace equation) for the range outside the sphere (r > R):

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\phi}{dr}\right) = 0. \qquad (1.47)$$

Its first integral,

$$\frac{d\phi}{dr}(r) = \frac{c_2}{r^2}, \qquad (1.48)$$


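also gives the electric field (with the minus sign). Now using Eq. (1.45) and requiring the field to be
continuous at r = R, we get

$$\frac{c_2}{R^2} = -\frac{Q}{4\pi\varepsilon_0 R^2}, \quad \text{i.e.} \quad \frac{d\phi}{dr}(r) = -\frac{Q}{4\pi\varepsilon_0 r^2}, \qquad (1.49)$$

in an evident agreement with Eq. (19). Integrating this result again,

$$\phi(r) = -\frac{Q}{4\pi\varepsilon_0}\int \frac{dr}{r^2} = \frac{Q}{4\pi\varepsilon_0 r} + c_3, \quad \text{for } r > R, \qquad (1.50)$$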


we can select $c_3 = 0$, so that $\phi(\infty) = 0$, in accordance with the usual (though not compulsory) convention.
Now we can finally determine constant $c_1$ in Eq. (46) by requiring that this equation and Eq. (50) give
the same value of $\phi$ at the boundary r = R. (According to Eq. (33), if the potential had a jump, the
electric field at that point would be infinite.) The final answer may be presented as


23 See, e.g., MA Eq. (10.8) for $\partial/\partial\theta = \partial/\partial\varphi = 0$.
$$\phi(r) = \frac{Q}{4\pi\varepsilon_0 R}\left(\frac{R^2 - r^2}{2R^2} + 1\right), \quad \text{for } r \le R. \qquad (1.51)$$


We see that using the Poisson equation to find the electrostatic potential distribution for highly 
symmetric problems may be more cumbersome than directly finding the electric field - say, from the 
Gauss law. However, we will repeatedly see below that if the electric charge distribution is not fixed in 
advance, using Eq. (41) may be the only practicable way to proceed. 
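
The solution given by Eqs. (1.50)-(1.51) may be checked numerically against the Poisson equation. The following Python sketch applies a simple finite-difference form of the radial Laplace operator (1.43); the helper names, the step size, and the test values of Q and R are arbitrary assumptions, and the check is only approximate.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (assumed SI value)

def phi_sphere(r, Q, R):
    """Potential of a uniformly charged sphere, Eqs. (1.50)-(1.51) with c3 = 0."""
    k = Q / (4.0 * math.pi * EPS0)
    if r <= R:
        return k / R * ((R * R - r * r) / (2.0 * R * R) + 1.0)
    return k / r

def radial_laplacian(f, r, h):
    """Finite-difference estimate of (1/r^2) d/dr (r^2 df/dr), cf. Eq. (1.43)."""
    flux_out = (r + h / 2) ** 2 * (f(r + h) - f(r)) / h
    flux_in = (r - h / 2) ** 2 * (f(r) - f(r - h)) / h
    return (flux_out - flux_in) / (h * r * r)

Q, R = 1e-9, 0.1                           # assumed test values: 1 nC, 10 cm
rho = Q / (4.0 / 3.0 * math.pi * R**3)     # uniform charge density
phi = lambda r: phi_sphere(r, Q, R)

lap_inside = radial_laplacian(phi, 0.5 * R, 1e-3 * R)    # should be close to -rho/eps0
lap_outside = radial_laplacian(phi, 2.0 * R, 1e-3 * R)   # should be close to 0
jump_at_R = phi(R * (1 + 1e-9)) - phi(R)                 # continuity at r = R
```

Inside the sphere the estimate should reproduce the right-hand side of the Poisson equation, outside it should be negligible in comparison, and the potential should have no jump at r = R.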

Returning now to the general theory of electrostatic phenomena, let us calculate potential energy
U of an arbitrary system of electric charges $q_k$. Despite the apparently straightforward relation (31)
between U and $\phi$, the calculation is a little bit more complex than one might think. Indeed, let us rewrite
Eqs. (32), (33) for a single charge in the integral form:

$$U(\mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}', \quad \text{i.e.} \quad \phi(\mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{E}(\mathbf{r}') \cdot d\mathbf{r}', \qquad (1.52)$$


where $\mathbf{r}_0$ is some reference point. These integrals reflect the fact that the potential energy is just the
work necessary to move the charge from point $\mathbf{r}_0$ to point $\mathbf{r}$, and clearly depend on whether the charge
motion affects force F (and hence electric field E) or not. If it does not, i.e. if the field is produced by
some external charges (such fields $\mathbf{E}_{\rm ext}$ are also called external), everything is simple indeed: using the
linearity of relations (31) and (32), for the total potential energy we may write

$$U_{\rm ext} = \sum_k q_k \phi_{\rm ext}(\mathbf{r}_k), \quad \text{where} \quad \phi_{\rm ext}(\mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{E}_{\rm ext}(\mathbf{r}') \cdot d\mathbf{r}'. \qquad (1.53)$$


Repeating the argumentation that has led us to Eq. (9), we see that for a continuously distributed charge,
this sum turns into an integral:

$$U_{\rm ext} = \int \rho(\mathbf{r})\, \phi_{\rm ext}(\mathbf{r})\, d^3r. \qquad (1.54)$$


However, if the electric field is created by the charges whose energy we are calculating, the
situation is somewhat different. To calculate U for this case, let us use its independence of the
way the charge configuration has been created, considering the following process. First, let us move one
charged particle (say, $q_1$) from infinity to an arbitrary point of space ($\mathbf{r}_1$) in the absence of other charges.
During the motion the particle does not experience any force (again, the charge does not interact with
itself!), so that its potential energy is the same as at infinity (with the standard choice of the arbitrary
constant, zero): $U_1 = 0$. Now let us fix the position of that charge, and move another charge ($q_2$) from
infinity to point $\mathbf{r}_2$ (with velocity $v \ll c$, in order to avoid any magnetic field effects, to be discussed in
Chapter 5). This particle, during its motion, does experience the Coulomb force exerted by fixed $q_1$, so
that according to Eq. (31), its contribution to the final potential energy is

$$U_2 = q_2\, \phi_1(\mathbf{r}_2). \qquad (1.55)$$


Since the first particle was not moving during this process, the total potential energy U of the system is
equal to just $U_2$. This is exactly the equality used for writing the right-hand part of Eq. (36). (Prescribing
a similar energy to charge $q_1$ as well would constitute an error - a very popular one, and hence having a
special name, double-counting.)


Now, fixing the first two charges in points $\mathbf{r}_1$ and $\mathbf{r}_2$, respectively, and bringing in the third
charge from infinity, we increment the potential energy by

$$U_3 = q_3[\phi_1(\mathbf{r}_3) + \phi_2(\mathbf{r}_3)]. \qquad (1.56)$$

I believe that at this stage it is already clear how to generalize this result to the contribution from an
arbitrary (k-th) charge being moved in (Fig. 6):

$$U_k = q_k[\phi_1(\mathbf{r}_k) + \phi_2(\mathbf{r}_k) + \phi_3(\mathbf{r}_k) + \ldots + \phi_{k-1}(\mathbf{r}_k)] = q_k \sum_{k'<k} \phi_{k'}(\mathbf{r}_k). \qquad (1.57)$$

(Notice condition k’ < k, which suppresses erroneous double-counting.)




Fig. 1.6. Deriving the potential energy of a system of electric charges.


Now, summing up all the increments, for the total electrostatic energy of the system we get:

$$U = \sum_k U_k = \sum_{k,k'\,(k'<k)} q_k\, \phi_{k'}(\mathbf{r}_k). \qquad (1.58)$$

This is our final result in its generic form; it is so important that it is worth rewriting in two other
forms. First, for its generalization to the continuous charge distribution, we may use Eq. (35) to present
Eq. (58) in a more symmetric form:

$$U = \frac{1}{4\pi\varepsilon_0} \sum_{k,k'\,(k'<k)} \frac{q_k q_{k'}}{|\mathbf{r}_k - \mathbf{r}_{k'}|}. \qquad (1.59)$$

The expression under the sum is evidently symmetric with respect to the index swap, so that it may be
rewritten in a fully symmetric form,

$$U = \frac{1}{4\pi\varepsilon_0}\, \frac{1}{2} \sum_{k,k'\,(k'\ne k)} \frac{q_k q_{k'}}{|\mathbf{r}_k - \mathbf{r}_{k'}|}, \qquad (1.60)$$

which is now easily generalized to the continuous case:

$$U = \frac{1}{4\pi\varepsilon_0}\, \frac{1}{2} \int d^3r \int d^3r'\, \frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \qquad (1.61)$$

(As before, in this case the restriction expressed in the discrete charge case as $k \ne k'$ is not important,
because if the charge density is a continuous function, integral (61) does not diverge at point $\mathbf{r} = \mathbf{r}'$.)
To present this result in one more form, let us notice that according to Eq. (38), the integral over
$\mathbf{r}'$ in Eq. (61), divided by $4\pi\varepsilon_0$, is just the full electrostatic potential at point $\mathbf{r}$, and hence

$$U = \frac{1}{2} \int \rho(\mathbf{r})\, \phi(\mathbf{r})\, d^3r. \qquad (1.62)$$

For the discrete charge case, this result becomes

$$U = \frac{1}{2} \sum_k q_k\, \phi(\mathbf{r}_k), \qquad (1.63)$$

but now it is important to remember that the “full” potential’s value $\phi(\mathbf{r}_k)$ should exclude the (infinite)
contribution of charge k itself. Comparing the last two formulas with Eqs. (53) and (54), we see that the
electrostatic energy of charge interaction, as expressed via the charge-potential product, is half
that of charge energy in a fixed (“external”) field. This is evidently the result of the self-consistent
build-up of the electric field as the charge system is being formed. 24
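
The equivalence of the pair sum (1.59) and the half-sum of charge-potential products (1.63) is easy to verify numerically. The Python sketch below does this for point charges; the function names and the three-charge test configuration are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations

EPS0 = 8.8541878128e-12          # vacuum permittivity in F/m (assumed SI value)
K = 1.0 / (4.0 * math.pi * EPS0)

def energy_pairs(charges):
    """Eq. (1.59): every pair of charges (k' < k) is counted exactly once."""
    return K * sum(q1 * q2 / math.dist(r1, r2)
                   for (q1, r1), (q2, r2) in combinations(charges, 2))

def energy_half_sum(charges):
    """Eq. (1.63): (1/2) * sum of q_k * phi(r_k), excluding each charge's own field."""
    total = 0.0
    for k, (qk, rk) in enumerate(charges):
        phi_k = K * sum(q / math.dist(rk, r)
                        for j, (q, r) in enumerate(charges) if j != k)
        total += qk * phi_k
    return 0.5 * total

# Hypothetical configuration: three charges (in coulombs) at points given in meters.
system = [(1e-9, (0.0, 0.0, 0.0)),
          (-2e-9, (0.1, 0.0, 0.0)),
          (3e-9, (0.0, 0.2, 0.0))]
U_pairs = energy_pairs(system)
U_half = energy_half_sum(system)
```

Dropping the factor 0.5 in the second function would count each pair twice, which is exactly the double-counting error the text warns about.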

Now comes an important conceptual question: can we locate this interaction energy in space?
Expressions (60)-(63) seem to imply that contributions to U come only from the regions where electric
charges are located. However, one of the beautiful features of physics is that sometimes completely
different interpretations of the same mathematical result are possible. In order to get an alternative view
of our current result, let us write Eq. (62) for a volume V so large that the electric field on the limiting
surface S is negligible, and plug into it the charge density expressed from the Poisson equation (41):

$$U = \frac{1}{2} \int_V \rho(\mathbf{r})\, \phi(\mathbf{r})\, d^3r = -\frac{\varepsilon_0}{2} \int_V \phi \nabla^2\phi\, d^3r. \qquad (1.64)$$

This expression may be integrated by parts as 25

$$U = -\frac{\varepsilon_0}{2} \left[ \oint_S \phi (\nabla\phi)_n\, d^2r - \int_V (\nabla\phi)^2\, d^3r \right]. \qquad (1.65)$$


According to our condition of negligible field $\mathbf{E} = -\nabla\phi$ on the surface, the first integral vanishes, and we
get a very important formula

$$U = \frac{\varepsilon_0}{2} \int (\nabla\phi)^2\, d^3r = \frac{\varepsilon_0}{2} \int E^2\, d^3r. \qquad (1.66)$$

This result certainly invites an interpretation very much different from Eq. (62): it is natural to
represent it in the following form:

$$U = \int u(\mathbf{r})\, d^3r, \quad \text{with} \quad u(\mathbf{r}) \equiv \frac{\varepsilon_0}{2} E^2(\mathbf{r}), \qquad (1.67)$$


24 The nature of this additional factor 1/2 is absolutely the same as in the well-known formula $U = (1/2)\kappa x^2$ for the
potential energy of an elastic spring providing returning force $F = -\kappa x$ proportional to the deviation x from
equilibrium.

25 This transformation follows from the divergence theorem MA Eq. (12.2) applied to vector function $\mathbf{f} = \phi\nabla\phi$, taking
into account the 3D differentiation rule MA Eq. (11.4a): $\nabla\cdot(\phi\nabla\phi) = (\nabla\phi)\cdot(\nabla\phi) + \phi\nabla\cdot(\nabla\phi) = (\nabla\phi)^2 + \phi\nabla^2\phi$.
and interpret $u(\mathbf{r})$ as the spatial density of the electric field energy, 26 which is continuously distributed
over all the space where the field exists - rather than just its part where the charges are located.

Let us have a look at how these two alternative pictures work for our testbed problem, a uniformly
charged sphere. If we start from Eq. (62), we may limit integration by the sphere volume ($0 \le r \le R$)
where $\rho \ne 0$. Using Eq. (51), and the spherical symmetry of the problem ($d^3r = 4\pi r^2 dr$), we get

$$U = \frac{1}{2}\, 4\pi \int_0^R \rho\phi\, r^2\, dr = \frac{1}{2}\, 4\pi\rho\, \frac{Q}{4\pi\varepsilon_0 R} \int_0^R \left( \frac{R^2 - r^2}{2R^2} + 1 \right) r^2\, dr = \frac{6}{5}\, \frac{1}{4\pi\varepsilon_0}\, \frac{Q^2}{2R}. \qquad (1.68)$$


On the other hand, if we use Eq. (67), we need to integrate energy everywhere, i.e. both inside 
and outside of the sphere: 


$$U = \frac{\varepsilon_0}{2}\, 4\pi \left[ \int_0^R E^2 r^2\, dr + \int_R^\infty E^2 r^2\, dr \right]. \qquad (1.69)$$


Using Eqs. (19) and (22) for, respectively, the external and internal regions, we get 


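$$U = \frac{\varepsilon_0}{2}\, 4\pi \left[ \int_0^R \left( \frac{Qr}{4\pi\varepsilon_0 R^3} \right)^2 r^2\, dr + \int_R^\infty \left( \frac{Q}{4\pi\varepsilon_0 r^2} \right)^2 r^2\, dr \right] = \left( \frac{1}{5} + 1 \right) \frac{1}{4\pi\varepsilon_0}\, \frac{Q^2}{2R}. \qquad (1.70)$$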


This is (fortunately :-) the same answer as given by Eq. (68), but to some extent it is more informative 
because it shows how exactly the electric field energy is distributed between the interior and exterior of 
the charged sphere. 27 
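
The agreement between Eqs. (1.68) and (1.70), and the 1/5 : 1 split between the interior and exterior field energies, may also be confirmed by direct numerical integration. The Python sketch below uses a midpoint rule; the helper names, the number of integration points, and the test values of Q and R are assumptions made for illustration, and the exterior integral is mapped onto a finite interval by the substitution s = 1/r.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (assumed SI value)

def midpoint(f, a, b, n=2000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def sphere_energies(Q, R):
    """Return (U from Eq. (1.68), interior and exterior parts of Eq. (1.69))."""
    k = Q / (4.0 * math.pi * EPS0)
    rho = Q / (4.0 / 3.0 * math.pi * R**3)
    phi_in = lambda r: k / R * ((R * R - r * r) / (2.0 * R * R) + 1.0)  # Eq. (1.51)
    E_in = lambda r: k * r / R**3     # field inside the sphere, Eq. (22)
    E_out = lambda r: k / r**2        # field outside the sphere, Eq. (19)

    # Charge-potential picture: integration over the sphere volume only.
    U_charge = 0.5 * 4.0 * math.pi * rho * midpoint(lambda r: phi_in(r) * r * r, 0.0, R)

    # Field picture: with s = 1/r, the integral of E^2 r^2 dr from R to infinity
    # becomes the integral of E(1/s)^2 / s^4 ds from 0 to 1/R.
    U_in = 0.5 * EPS0 * 4.0 * math.pi * midpoint(lambda r: E_in(r) ** 2 * r * r, 0.0, R)
    U_out = 0.5 * EPS0 * 4.0 * math.pi * midpoint(
        lambda s: E_out(1.0 / s) ** 2 / s**4, 0.0, 1.0 / R)
    return U_charge, U_in, U_out

Q, R = 1e-9, 0.1                      # assumed test values: 1 nC, 10 cm
U_charge, U_in, U_out = sphere_energies(Q, R)
U_exact = 0.6 * Q**2 / (4.0 * math.pi * EPS0 * R)   # (6/5)(1/4 pi eps0) Q^2/2R
```

With these inputs, U_charge and U_in + U_out should both be close to U_exact, and the ratio U_in / U_out close to 1/5.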

We see that, as we could expect, within the realm of electrostatics, Eqs. (62) and (67) are 
equivalent. However, when we examine electrodynamics in Chapter 6 and on, we will see that the latter 
equation is more general, and that it is more adequate to associate energy with the field itself rather than 
its sources - in our current case, electric charges. 


1.4. Exercise problems

1.1. Calculate the electric field created by a thin, long, straight filament, electrically charged with
a constant linear density $\lambda$, using two approaches:

(i) directly from the Coulomb law, and 

(ii) using the Gauss law. 

1.2. Two thin, straight parallel filaments, separated by distance $\rho$, carry
equal and opposite uniformly distributed charges with linear density $\lambda$ - see Fig.
on the right. Calculate the electrostatic force (per unit length) of the Coulomb


26 In the Gaussian units, the standard replacement $\varepsilon_0 \to 1/4\pi$ turns the last of Eqs. (67) into $u(\mathbf{r}) = E^2/8\pi$.

27 Note that $U \to \infty$ at $R \to 0$. Such divergence appears at application of Eq. (67) to any point charge. Since it
does not affect the force acting on the charge, the divergence does not create any technical difficulty for analysis
of charge statics or nonrelativistic dynamics, but it points to a conceptual problem of classical electrodynamics as
a whole. This issue will be discussed at the very end of the course (Sec. 10.6).


interaction between the wires. Compare the result with the Coulomb law for the force between the point
charges, and interpret their difference.


1.3. A sphere of radius R, whose volume had been charged with a constant density $\rho$, is split with
a very narrow, planar gap passing through its center. Find the Coulomb force between the resulting two
hemispheres.

1.4. Calculate the distribution of the electrostatic potential created by a straight, thin filament of
finite length 2l, charged with a constant linear density $\lambda$, and explore the result in the limits of very
small and very large distances from the filament.


1.5. A thin plane sheet, perhaps of an irregular shape, carries an electric charge distributed over
the sheet with a constant areal density $\sigma$.

(i) Express the electric field component normal to the plane, at a certain distance from it, via the
solid angle $\Omega$ at which the sheet is visible from the observation point.

(ii) Use the result to calculate the field in the center of a cube, with one face charged with
constant density $\sigma$.


1.6. Can one create electrostatic fields with the Cartesian components proportional to the
following products of Cartesian coordinates {x, y, z},

(i) {yz, xz, xy},

(ii) {xy, xy, yz},

in a finite region of space?


1.7. Distant sources have been used to create different
electric fields on two sides of a wide and thin metallic membrane
with a round hole of radius R in it - see Fig. on the right. Besides
the local perturbation created by the hole, the fields are uniform:

$$\mathbf{E}\big|_{r \gg R} = \mathbf{n}_z \times \begin{cases} E_1, & \text{at } z < 0, \\ E_2, & \text{at } z > 0. \end{cases}$$

Prove that the system may serve as an electrostatic lens for
charged particles flying along axis z, at distances $\rho \ll R$ from it,
and calculate the focal distance f of the lens. Spell out the
conditions of validity of your result.


1.8. By direct calculation, find the average electric potential of the spherical surface of radius R,
created by a point charge q located at distance r > R from the sphere’s center. Use the result to prove the
following general mean value theorem: the electric potential at any point is always equal to its average
value on any spherical surface with the center at that point, and containing no electric charges inside it. 

1.9. Calculate the electrostatic energy per unit area of the system of
two thin, parallel planes with equal and opposite charges of a constant areal
density $\sigma$, separated by distance d - see Fig. on the right.
1.10. The system analyzed in the previous problem (two thin,
parallel, oppositely charged planes) is now placed into an external,
uniform, normal electric field $E_{\rm ext} = \sigma/\varepsilon_0$ - see Fig. on the right. Find the
forces (per unit area) acting on each plane, by two methods:

(i) directly from the electric field distribution, and 

(ii) from the potential energy of the system. 



1.11. A thin spherical shell of radius R, which had been charged with a constant areal density $\sigma$,
is split into two equal halves by a very narrow, planar cut passing through the sphere’s center. Calculate the
force of electrostatic repulsion between the resulting hemispheric shells.


1.12. Two similar thin, circular, coaxial disks of radius R, separated
by distance 2d, are uniformly charged with equal and opposite areal densities
$\pm\sigma$ - see Fig. on the right. Calculate and sketch the distribution of the
electrostatic potential and the electric field of the disks along their common
axis.


1.13. In a certain reference frame, the electrostatic potential created
by some electric charge distribution is

$$\phi(\mathbf{r}) = C \left( \frac{1}{r} + \frac{1}{2r_0} \right) \exp\left\{ -\frac{r}{r_0} \right\},$$

where C and $r_0$ are constants, and $r \equiv |\mathbf{r}|$ is the distance from the origin. Calculate the charge distribution
in space.


1.14. A thin flat sheet, cut in a form of a rectangle of size $a \times b$, is electrically charged with a
constant areal density $\sigma$. Without an explicit calculation of the spatial distribution $\phi(\mathbf{r})$ of the
electrostatic potential induced by this charge, find the ratio of its values at the center and at the corners
of the rectangle.

Hint: Consider partitioning the rectangle into several similar parts and using the linear 
superposition principle. 

1.15. Explore the relation between the Laplace equation (42) and the condition of minimum of
the electrostatic field energy (67).

1.16. Calculate the energy of electrostatic interaction of two spheres, of radii $R_1$ and $R_2$, each
with a spherically-symmetric charge distribution, separated by distance $d > R_1 + R_2$.
1.17. Prove the following reciprocity theorem of electrostatics: 28 if two spatially-confined charge
distributions $\rho_1(\mathbf{r})$ and $\rho_2(\mathbf{r})$ create respective distributions $\phi_1(\mathbf{r})$ and $\phi_2(\mathbf{r})$ of the electrostatic potential,
then

$$\int \rho_1(\mathbf{r})\, \phi_2(\mathbf{r})\, d^3r = \int \rho_2(\mathbf{r})\, \phi_1(\mathbf{r})\, d^3r.$$

Hint: Consider integral $\int \mathbf{E}_1 \cdot \mathbf{E}_2\, d^3r$.


28 This is only the simplest one of the whole family of reciprocity theorems in electromagnetism. (Sometimes it is
called "Green's reciprocity theorem", but historically it is more fair to reserve the last name for the generalization
to surface charges, using Eq. (2.210), to be discussed in Sec. 2.7 below.)
Chapter 2. Charges and Conductors

In this chapter I will start addressing the (very common) situations when the electric charge distribution
in space is not known a priori, but rather should be calculated in a self-consistent way together with the
electric field it creates. The simplest situations of this kind involve conductors, and lead to the so-called
boundary problems in which partial differential equations are solved with appropriate boundary
conditions. Such problems are also broadly used in other parts of electrodynamics (and indeed in other
fields of physics as well), so that following tradition, I will use this chapter’s material as a playground
for a discussion of various methods of boundary problem solution, and the special functions most
frequently encountered on this way.


2.1. Electric field screening 


The basic principles of electrostatics outlined in Chapter 1 present the conceptually full solution
for the problem of finding electric field (and hence Coulomb forces) induced by a charge distribution,
for example, charge density $\rho(\mathbf{r})$. However, in most practical situations this function is not known but
should be found self-consistently with the field. The conceptually simplest case of this type arises when
certain point charges $q_k$ are placed near a surface of a good conductor, e.g., a metal: the electric field of
these charges induces additional charges at the conductor’s surface, which also contribute to the field.
Another important type of problem has no space-positioned charges at all; here only the total
charges of the involved conductors are fixed, but their spatial distribution inside each conductor has to
be found. The full solution of such problems, of course, should satisfy Eq. (1.5) for the total field and
total set of charges.


To approach the problems, I need to discuss, if only very briefly, 1 the relevant physics of
conductors. In the simplest macroscopic model, conductors are treated as materials having internal
charged particles (e.g., electrons in metals) that are free to move under the effect of force - in particular,
the force $\mathbf{F} = q\mathbf{E}$ exerted by electric field E. In electrostatics (which specifically excludes the case of dc
current, to be discussed in Chapter 4 below), there should be no such motion, so that everywhere inside
the conductor the electric field should vanish:


$$\mathbf{E} = 0. \qquad (2.1a)$$

This is the electric field screening 2 effect. According to Eq. (1.33), this condition may be rewritten in
another, frequently more convenient form:

$$\phi = \text{const}; \qquad (2.1b)$$

note, however, that if a problem includes several unconnected conductors, the constant in Eq. (1b) may
be different for each of them.


1 More detailed discussions may be found, e.g., in Sec. 13.5 of J. Hook and H. Hall, Solid State Physics, 2nd ed.,
Wiley, 1991, or the section on electric field screening in Chapter 17 of N. Ashcroft and N. Mermin, Solid State 
Physics, Brooks Cole, 1976. 

2 This term, used for electric field, should not be confused with shielding - the word used for the description of 
magnetic field reduction by magnetic materials - see Chapter 5 below. 


© 2013-2016 K. Likharev 


Essential Graduate Physics 


EM: Classical Electrodynamics 


Now let us examine what we can say about the electric field outside a conductor, within the same
macroscopic model. At close proximity, any smooth surface (in our case that of a conductor) looks
planar. Let us integrate Eq. (1.28) over a narrow ($d \ll l$) rectangular loop C encircling a part of such
plane conductor’s surface (see the dashed line in Fig. 1), and apply to the electric field the well-known
vector algebra equality - the Stokes theorem 3

$$\int_S (\nabla \times \mathbf{E})_n\, d^2r = \oint_C \mathbf{E} \cdot d\mathbf{r}, \qquad (2.2)$$

where S is the surface limited by contour C, in our case dominated by two straight lines of length l. This
means that if l is much smaller than the characteristic scale of field change, the right-hand part of Eq. (2)
equals $[(E_\tau)_{\rm in} - (E_\tau)_{\rm out}]\, l$, where $E_\tau$ is the field’s component parallel to the surface. On the other hand,
according to Eq. (1.28), the left-hand side of Eq. (2) equals zero. Hence, $E_\tau$ should be continuous at the
surface, and in order to satisfy Eq. (1a) inside the conductor, immediately outside it, $E_\tau = 0$ as well.



Fig. 2.1. Electric field near conductor’s surface: $E_\tau = 0$, $E_n = \sigma/\varepsilon_0$.


Hence, the field just outside the conductor has to be normal to its surface. In order to find this
normal field, let us apply the Gauss law (1.16) to a plane pillbox of area A, similar to the one discussed
in Sec. 1.2 - see Fig. 1.4. Due to Eq. (1), the total electric flux through the pillbox walls is now $(E_n)_{\rm out} A$,
so that for this surface field we get

$$\sigma = \varepsilon_0 (E_n)_{\rm out} = -\varepsilon_0 (\nabla\phi)_n \equiv -\varepsilon_0 \frac{\partial\phi}{\partial n}, \qquad (2.3)$$

where $\sigma$ is the areal density of conductor’s surface charge. So, the normal component of the field is
related to the surface charge density by the universal relation (3).

For the electrostatic potential the macroscopic model provides an even simpler result.
Indeed, applying the latter of integrals (1.52) to a short path d across the surface normal to it, we see that
since $E_n$ is finite, the potential change $\Delta\phi$ vanishes as $d \to 0$. Hence Eq. (1b) is also valid for potential’s
value immediately outside conductor’s surface.

Before starting to use the macroscopic model for solution of particular problems of electrostatics,
let us briefly discuss its limitations. Since the argumentation leading to Eq. (3) is valid for any thickness
d of the Gauss pillbox, within the macroscopic model, the surface charge is located within an infinitely


3 See, e.g., MA Eq. (12.1). 
thin surface layer. This is of course impossible physically: for one, this would require an infinite volume
density $\rho$ of charge. In reality the charged layer (and hence the region of electric field’s crossover from
the finite value (3) to zero) has a nonvanishing thickness $\lambda$. At least three effects contribute to $\lambda$:

(i) Atomic structure of matter. Within each atom, the electric field does exist and is highly non-uniform.
Thus Eq. (1) is valid only for the spatial average of the field in a conductor, and cannot be
taken seriously on the atomic scale $a_0 \sim 10^{-10}$ m. 4

(ii) Thermal excitation. In conductor’s bulk, the numbers of protons of atomic nuclei (n) and
electrons ($n_e$) per unit volume are balanced, so that the net charge density, $\rho = e(n - n_e)$, vanishes. 5
However, if an external electric field penetrates a conductor, electrons can shift in or out of its affected
part, depending on the field addition to their potential energy, $\Delta U = q_e\phi = -e\phi$. (Here the arbitrary
constant in $\phi$ is chosen to give $\phi = 0$ inside the conductor.) In classical statistics, this change is
described by the Boltzmann distribution: 6

$$n_e(\mathbf{r}) = n \exp\left\{ -\frac{U(\mathbf{r})}{k_B T} \right\}, \qquad (2.4)$$

where $k_B \approx 1.38\times10^{-23}$ J/K is the Boltzmann constant, and T is temperature in SI units (kelvins). As a
result, the net charge density is

$$\rho(\mathbf{r}) = en \left( 1 - \exp\left\{ \frac{e\phi(\mathbf{r})}{k_B T} \right\} \right). \qquad (2.5)$$


If the field did not move the atomic nuclei at all, we could plug the last formula directly into the Poisson
equation (1.41). Actually, the penetrating electric field shifts the average charge of the nuclei as well. As
will be discussed in the next chapter, this results in the reduction of the electric field by a medium-specific
dimensionless factor $\varepsilon_r$ (typically not too different from 1), called the dielectric constant. As a result, the
Poisson equation takes the form, 7

$$\frac{d^2\phi}{dz^2} = -\frac{\rho}{\varepsilon_r\varepsilon_0} = \frac{en}{\varepsilon_r\varepsilon_0} \left( \exp\left\{ \frac{e\phi}{k_B T} \right\} - 1 \right), \qquad (2.6)$$


where we have taken advantage of the 1D geometry of the system to simplify the Laplace operator, with
axis z normal to the surface. Even with this simplification, Eq. (6) is a nonlinear differential equation
allowing an analytical but rather bulky solution. Since our current goal is just to estimate the field
penetration depth $\lambda$, let us simplify the equation further by considering the low-field limit: $e|\phi| \sim e|E|\lambda \ll k_B T$.
In this limit we can expand the exponent into the Taylor series, and limit ourselves to the two
leading terms (of which the first one cancels with the unity). As a result, Eq. (6) becomes linear,

$$\frac{d^2\phi}{dz^2} = \frac{en}{\varepsilon_r\varepsilon_0}\, \frac{e\phi}{k_B T}, \quad \text{i.e.} \quad \frac{d^2\phi}{dz^2} = \frac{1}{\lambda^2}\, \phi, \qquad (2.7)$$


4 This scale originates from the quantum-mechanical effects of electron motion, characterized by the Bohr radius
$r_B \approx 0.5\times10^{-10}$ m - see, e.g., QM Eq. (1.13).

5 Here e denotes the positive fundamental charge, $e \approx 1.6\times10^{-19}$ C, so that the electron charge equals (-e).

6 See, e.g., SM Sec. 3.1.

7 This equation and/or its straightforward generalization to the case of charged particles (ions) of several kinds is
frequently (especially in the theories of electrolytes and plasmas) called the Debye-Hückel equation.
where constant $\lambda$ in this case is equal to the so-called Debye screening length $\lambda_D$, defined by relation

$$\lambda_D^2 \equiv \frac{\varepsilon_r\varepsilon_0 k_B T}{e^2 n}. \qquad (2.8)$$


Equation (7) is easy to solve: it describes an exponential decrease of the electric potential, with
the characteristic length $\lambda_D$: $\phi \propto \exp\{-z/\lambda_D\}$. Plugging in the fundamental constants $\varepsilon_0$, e, and $k_B$, we get
the following estimate: $\lambda_D[\text{m}] \approx 70\, (\varepsilon_r T[\text{K}]/n[\text{m}^{-3}])^{1/2}$. According to this formula, in semiconductors at
room temperature, the Debye length may be rather substantial. For example, in silicon ($\varepsilon_r \approx 12$) doped to
the charge carrier concentration $n = 3\times10^{24}$ m$^{-3}$ (the value typical for modern integrated circuits), 8 $\lambda_D \approx 2$
nm, still well above the atomic size scale $a_0$. However, for typical good metals ($n \sim 10^{28}$ m$^{-3}$, $\varepsilon_r \sim 10$)
the same formula gives an estimate $\lambda_D \sim 4\times10^{-11}$ m, less than $a_0$. In this case Eq. (8) should not be taken
too literally, because it is based on the assumption of continuous charge distribution.
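
These estimates are easy to reproduce from Eq. (2.8). The short Python sketch below does so; the function name, the hand-typed SI values of the constants, and the choice T = 300 K for "room temperature" are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m (assumed SI value)
E_CHARGE = 1.602176634e-19   # fundamental charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K

def debye_length(eps_r, T, n):
    """Eq. (2.8): lambda_D = sqrt(eps_r * eps0 * k_B * T / (e^2 * n)), in meters."""
    return math.sqrt(eps_r * EPS0 * K_B * T / (E_CHARGE**2 * n))

prefactor = debye_length(1.0, 1.0, 1.0)        # numerical coefficient of the estimate
lam_si = debye_length(12.0, 300.0, 3e24)       # doped silicon at an assumed T = 300 K
lam_metal = debye_length(10.0, 300.0, 1e28)    # good metal at an assumed T = 300 K
```

With these inputs the coefficient comes out close to 70 (more exactly, about 69), the silicon value is slightly above 2 nm, and the metal value is close to 4×10^-11 m, in agreement with the figures quoted above.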


(iii) Quantum statistics. Actually, the last estimate is not valid for good metals (and highly doped
semiconductors) for one more reason: their free electrons obey quantum (Fermi-Dirac) statistics rather
than the Boltzmann distribution (4). 9 As a result, at all realistic temperatures they form a degenerate
quantum gas, occupying all available energy states below a certain level $\mathcal{E}_F \gg k_B T$ called the Fermi
energy. In these conditions, the screening of relatively low electric field 10 may be described by replacing
Eq. (5) with

$$\rho = e(n - n_e) = -e\, g(\mathcal{E}_F)(-U) = -e^2 g(\mathcal{E}_F)\, \phi, \qquad (2.9)$$

where $g(\mathcal{E})$ is the density of quantum states (per unit volume) at electron’s energy $\mathcal{E}$. At the Fermi
surface, the density is of the order of $n/\mathcal{E}_F$. 11 As a result, we again get the second of Eqs. (7), but with a
different characteristic scale $\lambda$, defined by the following relation:

$$\lambda_{TF}^2 \equiv \frac{\varepsilon_r\varepsilon_0}{e^2 g(\mathcal{E}_F)} \sim \frac{\varepsilon_r\varepsilon_0 \mathcal{E}_F}{e^2 n}, \qquad (2.10)$$

and called the Thomas-Fermi screening length. Since for most good metals, n is of the order of $10^{29}$ m$^{-3}$,
and $\mathcal{E}_F$ is of the order of 10 eV, Eq. (10) typically gives $\lambda_{TF}$ close to a few $a_0$, and makes the Thomas-Fermi
screening theory valid at least semi-quantitatively.
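
The order-of-magnitude form of Eq. (2.10) may be evaluated in the same way. In the Python sketch below, the function name and the hand-typed constants are illustrative, and the two values of the dielectric constant (1 and 10) are assumptions, since the text does not fix $\varepsilon_r$ for this estimate.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m (assumed SI value)
E_CHARGE = 1.602176634e-19   # fundamental charge, C

def thomas_fermi_length(eps_r, fermi_energy_eV, n):
    """Estimate from Eq. (2.10), with g(E_F) replaced by n / E_F; result in meters."""
    e_f = fermi_energy_eV * E_CHARGE   # Fermi energy converted to joules
    return math.sqrt(eps_r * EPS0 * e_f / (E_CHARGE**2 * n))

lam_tf_1 = thomas_fermi_length(eps_r=1.0, fermi_energy_eV=10.0, n=1e29)
lam_tf_10 = thomas_fermi_length(eps_r=10.0, fermi_energy_eV=10.0, n=1e29)
```

Both results fall in the range of roughly 10^-10 m, i.e. of the order of the atomic scale $a_0$, consistent with the semi-quantitative status of the estimate.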


To summarize, the electric field penetration into good conductors is limited to a depth $\lambda$ ranging
from fractions of a nanometer to a few nanometers, so that for problems with the characteristic size 
much larger than that scale, the macroscopic boundary conditions (1) give a very good accuracy, and we 
will use them in the rest of this chapter. However, the reader should remember that in some situations 


8 There is a good reason for making an estimate of $\lambda_D$ for this case: the electric field created by the gate electrode
of a field-effect transistor, penetrating into doped silicon by a depth ~$\lambda_D$, controls current in this most important
electronic device - on whose back all the current information revolution rides. Because of that, $\lambda_D$ establishes the
possible scale of semiconductor circuit shrinking which is the basis of the well-known Moore’s law. (Practically,
the scale is determined by integrated circuit patterning techniques, and Eq. (8) may be used to find the proper
charge carrier density n and hence the level of silicon doping.)

9 See, e.g., SM Sec. 2.8. For a more detailed derivation of Eq. (10), see SM Chapter 3.

10 In good metals this equation is valid up to the fields ~ $\mathcal{E}_F/e\lambda_{TF}$ ~ $10^9$ V/m, very high by the usual standards. For
example, the electric breakdown threshold for vacuum (or air-filled) gaps is ~$3\times10^6$ V/m.

11 See, e.g., SM Sec. 3.3. 
involving semiconductors, as well as at nanoscale experiments with metals, the electric field penetration
effect should be taken into account.


2.2. Capacitance 

Let us start with systems consisting of charged conductors alone. Our goal here is calculating the
distributions of electric field E and potential $\phi$ in space, and the distribution of the surface charge
density $\sigma$ over the conductor surfaces. However, before doing that for particular situations, let us see if
there are any integral measures of these distributions that should be our primary focus.

The simplest case is of course a single conductor in the otherwise free space. According to Eq.
(1), all its volume should have a constant electrostatic potential $\phi$, evidently providing one convenient
global measure of the situation. Another integral measure is evidently provided by the total charge

$$Q = \int_V \rho\, d^3r = \oint_S \sigma\, d^2r, \qquad (2.11)$$

where the latter integral is extended over the whole surface S of the conductor. In the general case, what
can we tell about the relation between Q and $\phi$? At Q = 0, there is no electric field in the system, and it is
natural (though not necessary) to select the arbitrary constant in the electrostatic potential to have $\phi = 0$.
Then, if the conductor is charged with a finite Q, according to the Coulomb law, the electric field in any
point of space is proportional to Q. Hence the electrostatic potential everywhere, including its value $\phi$ on
the conductor, is also proportional to Q:

$$\phi = pQ. \qquad (2.12)$$

The proportionality coefficient p, which depends on the conductor size and shape but not on Q, is called 
the reciprocal capacitance (or, not too often, “electrical elastance”). Usually, Eq. (12) is rewritten in a 
different form, 

$$Q = C\phi, \quad \text{with } C \equiv \frac{1}{p}, \tag{2.13}$$

where C is called self-capacitance. (Frequently, C is called just capacitance, but we will soon see that 
for more complex situations the latter term may be too ambiguous.) 

Before going to calculation of C, let us have a look at the electrostatic energy of a single 
conductor. In order to calculate it, of the several equations discussed in Chapter 1, Eq. (1.63) is most 
convenient, because all elementary charges $q_k$ are now parts of the conductor surface charge, and hence 
sit at the same potential $\phi$. As a result, the equation becomes very simple: 

$$U = \frac{1}{2}\phi\sum_k q_k = \frac{1}{2}Q\phi. \tag{2.14}$$


Moreover, using the linear relation (13), the same result may be re-written in two more forms: 

$$U = \frac{Q^2}{2C} = \frac{C}{2}\phi^2. \tag{2.15}$$

We will discuss several ways to calculate C in the next sections, and right now will have a quick 
look at just the simplest example for which we have calculated everything necessary in the previous 
chapter: a conducting sphere of radius R. Indeed, we already know the electric field distribution: 
according to Eq. (1), E = 0 inside the sphere, while Eq. (1.19), with Q(r) = Q, describes the field 
distribution outside it. Moreover, since the latter formula is exactly the same as for the point charge 
placed in the sphere’s center, the potential distribution in space can be obtained from Eq. (1.35) by 
replacing q with the sphere’s full charge Q. Hence, on the surface of the sphere (and, according to Eq. (2), 
through its interior), 

$$\phi = \frac{1}{4\pi\varepsilon_0}\frac{Q}{R}. \tag{2.16}$$


Comparing this result with the definition (13), for the self-capacitance we obtain 12 

$$C = 4\pi\varepsilon_0 R = 2\pi\varepsilon_0 D, \qquad D = 2R. \tag{2.17}$$


This formula, which should be well familiar to the reader, is convenient for getting some feeling of 
how large the SI unit of capacitance (1 farad, abbreviated as F) is: the self-capacitance of Earth ($R_E \approx 6.37\times10^{6}$ m) 
is below 1 mF! Another important note is that while Eq. (17) is not exactly valid for a 
conductor of arbitrary shape, it implies an important estimate 

$$C \sim 2\pi\varepsilon_0 a, \tag{2.18}$$


where a is the scale of the linear size of any conductor. 13 
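
The estimate for Earth quoted above is easy to reproduce. The following is only a minimal sketch of Eq. (2.17); the function name and the rounded value of the mean Earth radius are illustrative assumptions, not part of the original text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def sphere_self_capacitance(radius_m: float) -> float:
    """Self-capacitance C = 4*pi*eps0*R of an isolated conducting sphere, Eq. (2.17)."""
    return 4.0 * math.pi * EPS0 * radius_m


R_EARTH = 6.37e6  # mean Earth radius in meters (assumed, rounded)
C_earth = sphere_self_capacitance(R_EARTH)
print(f"Self-capacitance of Earth: {C_earth * 1e3:.2f} mF")
```

The result is a fraction of a millifarad, in line with the statement that it is below 1 mF.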

Now proceeding to a system of two conductors, we immediately see why we should be careful 
with the capacitance definition: one constant C is insufficient to describe such a system. Indeed, here we 
have two, generally different conductor potentials, $\phi_1$ and $\phi_2$, that may depend on both conductor 
charges, $Q_1$ and $Q_2$. Using the same arguments as for the one-conductor case, we may conclude that the 
dependence is always linear: 

$$\phi_1 = p_{11}Q_1 + p_{12}Q_2, \qquad \phi_2 = p_{21}Q_1 + p_{22}Q_2, \tag{2.19}$$

but still has to be described not with one but with four coefficients $p_{jj'}$ (j, j’ = 1, 2) forming the so-called 
reciprocal capacitance matrix 

$$\begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix}. \tag{2.20}$$


Plugging relation (19) into Eq. (1.63), we see that the full electrostatic energy of the system may be 
expressed by a quadratic form: 

$$U = \frac{p_{11}}{2}Q_1^2 + \frac{p_{12}+p_{21}}{2}Q_1Q_2 + \frac{p_{22}}{2}Q_2^2. \tag{2.21}$$


12 In the Gaussian units, using the standard replacement $4\pi\varepsilon_0 \to 1$, this relation takes a remarkably simple form: C 
= R, good to remember. Generally, in the Gaussian units (but not in the SI system!) the capacitance has the 
dimensionality of length, i.e. is measured in centimeters. Note also that a convenient fractional SI unit, 1 picofarad 
($10^{-12}$ F), is very close to the Gaussian unit: 1 pF = $(1\times10^{-12})/(4\pi\varepsilon_0\times10^{-2})$ cm ≈ 0.8988 cm. 

13 These arguments are somewhat insufficient to say which size should be used for a in the case of narrow, 
extended conductors, e.g., a thin, long wire of length L and diameter D ≪ L. Very soon we will see that in 
such cases the electrostatic energy, and hence C, should mostly depend on the larger size of the conductor. 


It is evident that the middle term in the right-hand part of this equation describes the electrostatic 
coupling of the conductors. (Without it, the energy would be just a sum of two independent electrostatic 
energies of conductors 1 and 2.) This is why systems with $|p_{12}|, |p_{21}| \ll p_{11}, p_{22}$ are called weakly 
coupled, and may be analyzed using approximate methods - see, e.g., Fig. 3 and its discussion below. 

Before proceeding further, let us use the Lagrangian formalism of analytical mechanics 14 to 
argue that the off-diagonal elements of matrix $p_{jj'}$ are always equal: 

$$p_{12} = p_{21}. \tag{2.22}$$


Indeed, charges $Q_{1,2}$ may be taken for generalized coordinates $q_j$ (j = 1, 2) of the system; then the 
corresponding generalized forces may be found as 

$$\mathcal{F}_j \equiv \frac{\partial U}{\partial q_j} = \frac{\partial U}{\partial Q_j}. \tag{2.23}$$

Applying this equation to Eq. (21), we see that, for example, 

$$\mathcal{F}_1 = p_{11}Q_1 + \frac{p_{12}+p_{21}}{2}Q_2. \tag{2.24}$$

Now we may argue that the dynamics of charge $Q_j$ should only depend on the electrostatic potential this 
charge “sees”. This means, in particular, that $\phi_1$ should be a unique function of $\mathcal{F}_1$. Comparing Eq. (24) 
with the first of Eqs. (19), we see that for this to be true, Eq. (22) should indeed be valid. 

Equations (19) and (21) show that for the general case of arbitrary charges $Q_1$ and $Q_2$, the system 
properties cannot be reduced to just one coefficient (“capacitance”). Let us consider three particular 
cases when such a reduction is possible. 


(i) The system as a whole is electrically neutral: $Q_1 = -Q_2 \equiv Q$. In this case the most important 
function of Q is the difference of conductor potentials, called voltage: 15 

$$V \equiv \phi_1 - \phi_2. \tag{2.25}$$

For that function, the subtraction of two Eqs. (19) gives 

$$V = \frac{Q}{C_m}, \quad \text{with } C_m \equiv \frac{1}{(p_{11}+p_{22})-(p_{12}+p_{21})}, \tag{2.26}$$


where coefficient $C_m$ is called the mutual capacitance between the conductors - or, again, just 
“capacitance”. The same coefficient describes the electrostatic energy of the system. Indeed, plugging 
Eq. (25) into Eq. (21), we see that both forms of Eq. (15) are reproduced if $\phi$ is replaced with V, $Q_1$ with 
Q, and C with $C_m$: 

$$U = \frac{Q^2}{2C_m} = \frac{C_m}{2}V^2. \tag{2.27}$$


14 See, e.g., CM Chapter 2. 

15 A word of caution: in condensed matter physics, voltage is usually defined differently, as the difference of 
electrochemical rather than electrostatic potentials - see, e.g., SM Sec. 6.4. These two definitions coincide if the 
conductors have equal workfunctions (for example, if they are made of the same material), and in this course their 
difference will be ignored. 


The best known system for which the mutual capacitance $C_m$ may be readily calculated is the 
plane (or “parallel-plate”) capacitor, a system of two conductors separated with a narrow, plane gap 
(Fig. 2). Indeed, since the surface charges that contribute to the opposite charges ±Q of the conductors 
in this system attract each other, in the limit d ≪ a they sit entirely on the sides of the narrow gap. 


Fig. 2.2. Plane capacitor (d ≪ a). 


Let us apply the Gauss law to a pillbox volume (shown by dashed line in Fig. 2) whose area is a 
small part of the gap (but nevertheless much larger than $d^2$), with one of the plane lids inside a 
conductor, and another one inside the gap. The result immediately shows that the electric field within 
the gap is $E = \sigma/\varepsilon_0$, i.e. is independent of the pillbox thickness. Integrating this field across thickness d 
of the gap, we get $V = Ed = \sigma d/\varepsilon_0$, so that $\sigma = \varepsilon_0V/d$. But this voltage should not depend on the selection 
of the point of the gap area. As a result, $\sigma$ should be also constant over all the gap area A, and hence 
$Q = \sigma A = \varepsilon_0VA/d$. Thus we may write $V = Q/C_m$, with 

$$C_m = \frac{\varepsilon_0}{d}A. \tag{2.28}$$


Let me offer a few comments on this well-known formula. First, it is valid even if the gap is not 
quite planar, for example if it gently curves on a scale much larger than d. Second, Eq. (28) is only valid 
if $A \sim a^2$ is much larger than $d^2$, because its derivation ignores the electric field deviations from 
uniformity 16 at distances ~d near the gap edges. Finally, the same condition ($A \gg d^2$) assures that $C_m$ is 
much larger than the self-capacitance of each of the conductors - see Eq. (18). The opportunities given 
by this fact for electronic engineering and experimental physics practice are rather astonishing. For 
example, a very realistic 3-nm layer of high-quality aluminum oxide (which may provide a nearly 
perfect electric insulation between two thin conducting films) with area of 0.1 m$^2$ (which is a typical 
area of silicon wafers used in semiconductor industry) provides $C_m$ ~ 1 mF, 17 larger than the 
self-capacitance of the whole planet Earth! 
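
As a rough numerical check of this estimate, Eq. (2.28) may be evaluated directly, with the dielectric factor $\varepsilon_r$ mentioned in footnote 17 included in the numerator. This is a sketch under stated assumptions: the function name is hypothetical, and $\varepsilon_r = 10$ is the rounded value quoted in that footnote.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def plane_capacitance(area_m2: float, gap_m: float, eps_r: float = 1.0) -> float:
    """Mutual capacitance of a plane capacitor, Eq. (2.28), with an optional dielectric factor."""
    return eps_r * EPS0 * area_m2 / gap_m


# 3-nm aluminum oxide layer (eps_r ~ 10, an assumed rounded value) with area 0.1 m^2
C_oxide = plane_capacitance(area_m2=0.1, gap_m=3e-9, eps_r=10.0)

# Self-capacitance of Earth from Eq. (2.17), for comparison
C_earth = 4.0 * math.pi * EPS0 * 6.37e6

print(f"Oxide-layer capacitor: {C_oxide * 1e3:.1f} mF")
print(f"Earth self-capacitance: {C_earth * 1e3:.2f} mF")
```

With these inputs the oxide-layer capacitance comes out at a few millifarads, consistent with the order-of-magnitude figure in the text and indeed above the self-capacitance of Earth.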


In the case shown in Fig. 2, the electrostatic coupling of the two conductors is evidently strong. 
As an opposite example of a weakly coupled system, let us consider two conducting spheres of the same 
radius R, separated by a much larger distance d (Fig. 3). 


16 Frequently referred to as “fringe” fields, resulting in an additional “stray” capacitance $C_m' \sim \varepsilon_0 a$. 

17 Just as in Sec. 1, in order for the estimate to be realistic, I took into account the additional factor $\varepsilon_r$ (for 
aluminum oxide, close to 10) which should be included into the numerator of Eq. (28) to make it applicable to 
dielectrics - see Chapter 3 below. 


Fig. 2.3. A system of two well-separated, similar conducting spheres (d ≫ R). 


In this case the diagonal components of matrix $p_{jj'}$ may be approximately found from Eq. (16), 
i.e. by neglecting the coupling altogether: 

$$p_{11} = p_{22} \approx \frac{1}{4\pi\varepsilon_0 R}. \tag{2.29}$$


Now, if we had just one sphere (say, number 1), the electric potential at distance d from its center would 
be given by Eq. (16): $\phi = Q_1/4\pi\varepsilon_0 d$. Now if we move into this point a small (R ≪ d) sphere without its 
own charge, we may expect that its potential should not be too far from this result, so that $\phi_2 \approx Q_1/4\pi\varepsilon_0 d$. 
Comparing this expression with Eq. (19) (taken for $Q_2 = 0$), we get 

$$p_{21} = p_{12} \approx \frac{1}{4\pi\varepsilon_0 d} \ll p_{11}, p_{22}. \tag{2.30}$$


From here and Eq. (26), the mutual capacitance 

$$C_m \approx \frac{1}{p_{11}+p_{22}} \approx 2\pi\varepsilon_0 R. \tag{2.31}$$


We see that (somewhat counter-intuitively), in this case $C_m$ does not depend substantially on the 
distance between the spheres, i.e. does not describe their electrostatic coupling. The off-diagonal 
coefficients of the reciprocal capacitance matrix (20) play this role much better - see Eq. (30). 

(ii) Now let us consider the case when only one conductor of the two is charged, for example $Q_1 
\equiv Q$, while $Q_2 = 0$. Then Eqs. (19) yield 

$$\phi_1 = p_{11}Q_1. \tag{2.32}$$

Now, if we follow Eq. (13) and define $C_j \equiv 1/p_{jj}$ as the partial capacitance of conductor number j, we 
see that it differs from the mutual capacitance $C_m$ - cf. Eq. (26). For example, in the case shown in Fig. 
3, $C_1 = C_2 \approx 4\pi\varepsilon_0 R \approx 2C_m$. 

(iii) Finally, let us consider a popular case when one of the conductors is charged by a certain 
charge (say, $Q_1 = Q$), but the potential of another one is sustained constant, say $\phi_2 = 0$. 18 (This condition 
is especially easy to implement if the second conductor is much larger than the first one. Indeed, as the 
estimate (18) shows, in this case it would take much larger charge $Q_2$ to make potential $\phi_2$ comparable 
with $\phi_1$.) In this case the second of equations (19) yields $Q_2 = -(p_{21}/p_{22})Q_1$. Plugging this relation into 
the first of those equations, we get 


18 In electrical engineering, such constant-potential conductor is called the ground. This term stems from the fact 
that in many cases the Earth surface may be considered a good electric ground, because its potential is unaffected 
by laboratory-scale electric charges. 




$$\phi_1 = \left(p_{11} - \frac{p_{12}p_{21}}{p_{22}}\right)Q_1. \tag{2.33}$$

Thus, if we treat the reciprocal of the expression in parentheses, 

$$C_1^{\rm ef} \equiv \left(p_{11} - \frac{p_{12}p_{21}}{p_{22}}\right)^{-1}, \tag{2.34}$$

as the effective capacitance of the first conductor, it is generally different both from $C_m$ and (unless the 
conductors are far apart and their electrostatic coupling is negligible) from $C_1 = 1/p_{11}$. 

To summarize this section, the potential (and hence the actual capacitance) of a conductor in a 
two-conductor system may be very much dependent on what exactly is being done with the second 
conductor when the first one is charged. This is also true for multi-conductor systems (for whose 
description, Eqs. (19) and (21) may be readily generalized); moreover, in that case even the mutual 
capacitance between two selected conductors may depend on the electrostatic conditions of other 
components of the system. 
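
The three capacitances discussed in this section may be compared numerically for the weakly coupled pair of spheres of Fig. 3. The sketch below uses the approximate matrix elements (2.29) and (2.30); the function names and the sample values of R and d are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def reciprocal_matrix(R: float, d: float):
    """Approximate reciprocal capacitance matrix of two equal spheres, Eqs. (2.29)-(2.30), valid for d >> R."""
    p_diag = 1.0 / (4.0 * math.pi * EPS0 * R)
    p_off = 1.0 / (4.0 * math.pi * EPS0 * d)
    return ((p_diag, p_off), (p_off, p_diag))


def mutual_capacitance(p) -> float:
    """Case (i), Q1 = -Q2: Eq. (2.26)."""
    return 1.0 / ((p[0][0] + p[1][1]) - (p[0][1] + p[1][0]))


def partial_capacitance(p, j: int = 0) -> float:
    """Case (ii), only conductor j charged: C_j = 1/p_jj."""
    return 1.0 / p[j][j]


def effective_capacitance(p) -> float:
    """Case (iii), second conductor grounded: Eq. (2.34)."""
    return 1.0 / (p[0][0] - p[0][1] * p[1][0] / p[1][1])


R, d = 0.01, 1.0  # sphere radius and center-to-center distance in meters (sample values)
p = reciprocal_matrix(R, d)
C_m = mutual_capacitance(p)
C_1 = partial_capacitance(p)
C_ef = effective_capacitance(p)
for name, value in (("C_m", C_m), ("C_1", C_1), ("C_1^ef", C_ef)):
    print(f"{name} = {value * 1e12:.4f} pF")
```

For d ≫ R the output illustrates the statements above: $C_1$ and $C_1^{\rm ef}$ nearly coincide, while $C_m$ is close to one half of them.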


2.3. The simplest boundary problems 

In the general case when the electric field distribution in the free space between the conductors 
cannot be readily found from the Gauss law or by any other special methods, the best approach is to try 
to solve the differential Laplace equation (1.42), with boundary conditions (1b): 

$$\nabla^2\phi = 0, \qquad \phi\big|_{S_k} = \phi_k, \tag{2.35}$$

where $S_k$ is the surface of the k-th conductor of the system. After such a boundary problem has been 
solved, i.e. the spatial distribution $\phi(\mathbf{r})$ has been found in all points outside the conductors, it is 
straightforward to use Eq. (3) to find the surface charge density, and finally the total charge 

$$Q_k = \oint_{S_k}\sigma\, d^2r \tag{2.36}$$

of each conductor, and hence any component of the reciprocal capacitance matrix $p_{jj'}$. As an illustration, 
let us implement this program for three very simple problems. 

(i) Plane capacitor (Fig. 2). In this case, the easiest way to solve the Laplace equation is to use 
linear (Cartesian) coordinates with one coordinate axis, say z, normal to the conductor surfaces (Fig. 4). 




Fig. 2.4. Plane capacitor’s geometry used for the 
solution of the boundary problem (35). 




In these coordinates, the Laplace operator is just the sum of three second derivatives. 19 It is 
evident that due to the problem’s translational symmetry in the {x, y} plane, deep inside the gap (i.e. at the 
lateral distance from the edges much larger than d) the electrostatic potential may only depend on the 
coordinate perpendicular to the gap surfaces: $\phi(\mathbf{r}) = \phi(z)$. For such a function, derivatives over x and y 
vanish, and the boundary problem (35) is reduced to a very simple ordinary differential equation 

$$\frac{d^2\phi}{dz^2} = 0, \tag{2.37}$$

with boundary conditions 

$$\phi(0) = 0, \qquad \phi(d) = V. \tag{2.38}$$

(For the sake of notation simplicity, I have used the discretion of adding a constant to the potential to 
make one of the potentials vanish, and also definition (25) of voltage V.) The general solution of Eq. 
(37) is a linear function: $\phi(z) = c_1z + c_2$, whose constant coefficients $c_{1,2}$ may be found, in an elementary 
way, from the boundary conditions (38). The final solution is 

$$\phi = V\frac{z}{d}. \tag{2.39}$$

From here the only nonvanishing component of the electric field is 

$$E_z = -\frac{d\phi}{dz} = -\frac{V}{d}, \tag{2.40}$$

and the surface charge density of the capacitor plates is 

$$\sigma = \varepsilon_0E_n = \mp\varepsilon_0E_z = \pm\varepsilon_0\frac{V}{d}, \tag{2.41}$$

where the upper and lower signs correspond to the upper and lower plate, respectively. Since $\sigma$ does not 
depend on coordinates x and y, we can get the full charges $Q_1 = -Q_2 \equiv Q$ of the surfaces by its 
multiplication by the gap area A, giving us again the already known result (28) for the mutual 
capacitance $C_m \equiv Q/V$. I believe that this calculation, though very easy, may serve as a good introduction 
to the boundary problem solution philosophy. 

(ii) Coaxial-cable capacitor. A coaxial cable is a system of two round cylindrical, coaxial 
conductors, with the cross-section shown in Fig. 5. 



Fig. 2.5. Cross-section of a coaxial capacitor. 


19 See, e.g., MA Eq. (9.1). 

Evidently, in this case the cylindrical coordinates {$\rho$, $\varphi$, z}, with axis z along the common axis of 
the cylinders, are most appropriate. Due to the axial symmetry of the problem, in these coordinates 
$\mathbf{E}(\mathbf{r}) = \mathbf{n}_\rho E(\rho)$, $\phi(\mathbf{r}) = \phi(\rho)$, so that in the general expression for the Laplace operator 20 we can take 
$\partial/\partial\varphi = \partial/\partial z = 0$. As a result, only the first (radial) term of the operator survives, and the boundary problem (35) 
takes the form 

$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{d\phi}{d\rho}\right) = 0, \qquad \phi(a) = V, \quad \phi(b) = 0. \tag{2.42}$$


The sequential integration of this ordinary differential equation is elementary (and similar to that of the 
Poisson equation in spherical coordinates, performed in Sec. 1.3), giving 

$$\rho\frac{d\phi}{d\rho} = c_1, \qquad \phi(\rho) = c_1\int_a^\rho\frac{d\rho'}{\rho'} + c_2 = c_1\ln\frac{\rho}{a} + c_2. \tag{2.43}$$

Constants $c_{1,2}$ may be found using boundary conditions (42): 

$$V = c_2, \qquad 0 = c_1\ln\frac{b}{a} + c_2, \tag{2.44}$$

giving $c_1 = -V/\ln(b/a)$, so that solution (43) takes the following form: 

$$\phi(\rho) = V\left[1 - \frac{\ln(\rho/a)}{\ln(b/a)}\right]. \tag{2.45}$$


Next, for our axial symmetry the general expression for the gradient 21 is reduced to the radial derivative, 
so that 

$$E(\rho) \equiv -\frac{d\phi(\rho)}{d\rho} = \frac{V}{\rho\ln(b/a)}. \tag{2.46}$$


This expression, plugged into Eq. (2), allows us to find the density of conductors’ surface charge. For 
example, for the inner electrode 

$$\sigma_a = \varepsilon_0E(a) = \frac{\varepsilon_0V}{a\ln(b/a)}, \tag{2.47}$$

so that its full charge (per unit length of the system) is 

$$\frac{Q}{L} = 2\pi a\sigma_a = \frac{2\pi\varepsilon_0V}{\ln(b/a)}. \tag{2.48}$$


(It is straightforward to check that the charge of the outer electrode is equal and opposite.) Hence, by 
the definition of the mutual capacitance, its value per unit length is 

$$\frac{C_m}{L} \equiv \frac{Q}{LV} = \frac{2\pi\varepsilon_0}{\ln(b/a)}. \tag{2.49}$$


20 See, e.g., MA Eq. (10.3). 

21 See, e.g., MA Eq. (10.2). 

This expression shows that the total capacitance C is proportional to the system’s length L (if L 
≫ a, b), while being only logarithmically dependent on the dimensions of its cross-section. Since the logarithm 
of a very large argument is an extremely slow function (sometimes called “quasi-constant”), if the 
external conductor is made large (b ≫ a) the reciprocal capacitance diverges (i.e. $C_m$ tends to zero), but very weakly. Such a logarithmic 
divergence may be cut by any minuscule additional effect, for example by the finite length L of the 
system. This allows one to get a crude but very useful estimate of self-capacitance of a single wire: 

$$C \approx \frac{2\pi\varepsilon_0L}{\ln(L/a)}, \quad \text{for } L \gg a. \tag{2.50}$$

On the other hand, if the gap between the conductors is narrow: b = a + d, with d ≪ a, then $\ln(b/a) = 
\ln(1 + d/a)$ may be approximated as d/a, and Eq. (49) is reduced to $C_m \approx 2\pi\varepsilon_0aL/d$, i.e. to Eq. (28) for the 
plane capacitor, with $A = 2\pi aL$. 
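
The narrow-gap limit is easy to confirm numerically. The following minimal sketch evaluates Eq. (2.49) and compares it with the plane-capacitor expression (2.28) taken per unit length; the function name and the sample dimensions are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def coax_capacitance_per_length(a: float, b: float) -> float:
    """Mutual capacitance per unit length of a vacuum-filled coaxial cable, Eq. (2.49)."""
    if not 0.0 < a < b:
        raise ValueError("radii must satisfy 0 < a < b")
    return 2.0 * math.pi * EPS0 / math.log(b / a)


# A cable with b/a = 3.5 (sample dimensions, in meters)
c_cable = coax_capacitance_per_length(0.5e-3, 1.75e-3)
print(f"C_m/L = {c_cable * 1e12:.1f} pF/m")

# Narrow gap: b = a + d with d << a; compare with eps0 * (2*pi*a) / d
a, d = 1e-2, 1e-5
c_exact = coax_capacitance_per_length(a, a + d)
c_plane = EPS0 * 2.0 * math.pi * a / d
rel_diff = abs(c_exact - c_plane) / c_plane
print(f"relative difference in the narrow-gap limit: {rel_diff:.1e}")
```

The relative difference is of the order of d/a, as expected from the next term of the expansion of ln(1 + d/a).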

(iii) Spherical capacitor. This is a system of two conductors, with the same central cross-section 
as the coaxial cable (Fig. 5), but now with the spherical rather than axial symmetry. This symmetry 
implies that we are better off using spherical coordinates, so that potential $\phi$ depends only on one of 
them, the distance r from the common center of the conductors: $\phi(\mathbf{r}) = \phi(r)$. As we already know from 
Sec. 1.3, in this case the general expression for the Laplace operator is reduced to its first (radial) term, 
so that the Laplace equation takes a simple form - see Eq. (1.47). Moreover, we have already found the 
general solution to this equation - see Eq. (1.50): 

$$\phi(r) = \frac{c_1}{r} + c_2. \tag{2.51}$$


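Now acting exactly as above, i.e. determining constant $c_1$ from the boundary conditions $\phi(a) = V$, $\phi(b) = 
0$, we get 

$$V = c_1\left(\frac{1}{a}-\frac{1}{b}\right), \quad \text{so that } \phi(r) = \frac{V}{r}\left(\frac{1}{a}-\frac{1}{b}\right)^{-1} + c_2. \tag{2.52}$$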


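Next, we can use the spherical symmetry to find the electric field, $\mathbf{E}(\mathbf{r}) = \mathbf{n}_rE(r)$, with 

$$E(r) = -\frac{d\phi}{dr} = \frac{V}{r^2}\left(\frac{1}{a}-\frac{1}{b}\right)^{-1}, \tag{2.53}$$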


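and hence its values on conductors’ surfaces, and then the surface charge density $\sigma$ from Eq. (2). For 
example, for the inner conductor’s surface, 

$$\sigma_a = \varepsilon_0E(a) = \varepsilon_0\frac{V}{a^2}\left(\frac{1}{a}-\frac{1}{b}\right)^{-1}, \tag{2.54}$$

so that, finally, for the full charge of that conductor we get 

$$Q = 4\pi a^2\sigma_a = 4\pi\varepsilon_0\left(\frac{1}{a}-\frac{1}{b}\right)^{-1}V. \tag{2.55}$$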


(Again, the charge of the outer conductor is equal and opposite.) Now we can use the definition of the 
mutual capacitance to get the final result 

$$C_m \equiv \frac{Q}{V} = 4\pi\varepsilon_0\left(\frac{1}{a}-\frac{1}{b}\right)^{-1} = 4\pi\varepsilon_0\frac{ab}{b-a}. \tag{2.56}$$


For b ≫ a, this result coincides with Eq. (17) for the self-capacitance of the inner conductor. On the 
other hand, if the gap between two conductors is narrow, d ≡ b - a ≪ a, then 

$$C_m = 4\pi\varepsilon_0\frac{a(a+d)}{d} \approx 4\pi\varepsilon_0\frac{a^2}{d}, \tag{2.57}$$

i.e. the capacitance approaches that of the planar capacitor of area $A = 4\pi a^2$ - as it should. 
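
Both limits of Eq. (2.56) can be checked with a few lines of code. This is only an illustrative sketch; the function name and the sample radii are assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def spherical_capacitance(a: float, b: float) -> float:
    """Mutual capacitance of a spherical capacitor with inner radius a and outer radius b, Eq. (2.56)."""
    if not 0.0 < a < b:
        raise ValueError("radii must satisfy 0 < a < b")
    return 4.0 * math.pi * EPS0 * a * b / (b - a)


a = 0.01  # inner radius in meters (sample value)

# Limit b >> a: self-capacitance of the inner sphere, Eq. (2.17)
far_ratio = spherical_capacitance(a, 1000.0 * a) / (4.0 * math.pi * EPS0 * a)

# Limit d = b - a << a: plane capacitor of area 4*pi*a^2, Eq. (2.28)
d = 1e-3 * a
near_ratio = spherical_capacitance(a, a + d) / (EPS0 * 4.0 * math.pi * a**2 / d)

print(f"far-outer-sphere ratio: {far_ratio:.4f}")
print(f"narrow-gap ratio:       {near_ratio:.4f}")
```

Both ratios differ from unity only by small corrections, of the order of a/b in the first case and d/a in the second.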

All this seems (and is) very straightforward, but let us contemplate what was the reason for such 
easy successes. We have managed to find such coordinate transformations, for example {x, y, z} → {r, 
$\theta$, $\varphi$} in the spherical case, that both the Laplace equation and the boundary conditions involve only one 
of the new coordinates (in this case, r). The necessary condition for the former fact is that the new 
coordinates (in this case, spherical ones) are orthogonal. This means that three vector components of 
differential d$\mathbf{r}$, due to small variations of the new coordinates (say, dr, d$\theta$, and d$\varphi$), are mutually 
perpendicular. If this were not so, the Laplace operator would not fall into the simple sum of three 
independent parts, and could not be reduced, at the proper symmetry of the problem, to just one of these 
components, making it readily integrable. 


2.4. Other orthogonal coordinates 

Since the cylindrical and spherical coordinates are only the simplest examples of the orthogonal (or 
“orthogonal curvilinear”) coordinates, this methodology may be extended to other coordinate systems of 
this type. As an example, let us have a look at the following problem: finding the self-capacitance of a 
thin, round conducting disk (and, as the solution’s by-products, the distributions of the electric field and 
surface charge) - see Fig. 6. The cylindrical or spherical coordinates would not give too much help here, 
because though they have the appropriate axial symmetry about axis z, they would make the boundary 
condition on the disk too complex (two coordinates, either $\rho$ and z, or r and $\theta$). 



Fig. 2.6. The thin conducting disk problem. (The cross- 
section of the system by the vertical plane y = 0.) 


The relief comes from noting that the disk, i.e. the area z = 0, r < R, may be thought of as the 
limiting case of an axially symmetric ellipsoid - the result of rotation of the usual ellipse about one of its 
axes - in our case, the vertical axis z. 22 Analytically, such an ellipsoid may be described by the following 
equation: 

$$\frac{x^2+y^2}{a^2} + \frac{z^2}{b^2} = 1, \tag{2.58}$$


22 Alternative names for this surface are “degenerate ellipsoid”, “ellipsoid of rotation”, and “spheroid”. 


where a and b are the so-called major semi-axes whose ratio determines the ellipse eccentricity (the 
degree of squeezing). For our problem, we will only need oblate ellipsoids with a > b; according to Eq. 
(58), they may be presented as surfaces of constant $\alpha$ in the system of degenerate ellipsoidal (or 
“spheroidal”) coordinates {$\alpha$, $\beta$, $\varphi$}, which are related to the Cartesian coordinates as follows: 

$$\begin{aligned} x &= R\cosh\alpha\,\sin\beta\,\cos\varphi, \\ y &= R\cosh\alpha\,\sin\beta\,\sin\varphi, \\ z &= R\sinh\alpha\,\cos\beta. \end{aligned} \tag{2.59}$$

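Such ellipsoidal coordinates are the evident generalization of the spherical coordinates, which 
correspond to the limit $\alpha \gg 1$ (i.e. r ≫ R). In the opposite limit of small $\alpha$, the surface of constant $\alpha = 
0$ describes our thin disk of radius R. It is almost evident (and easy to prove) that coordinates (59) are 
also orthogonal, so that the Laplace operator may be expressed as a sum of three independent terms: 

$$\nabla^2 = \frac{1}{R^2(\cosh^2\alpha - \sin^2\beta)}\left[\frac{1}{\cosh\alpha}\frac{\partial}{\partial\alpha}\left(\cosh\alpha\frac{\partial}{\partial\alpha}\right) + \frac{1}{\sin\beta}\frac{\partial}{\partial\beta}\left(\sin\beta\frac{\partial}{\partial\beta}\right) + \left(\frac{1}{\sin^2\beta} - \frac{1}{\cosh^2\alpha}\right)\frac{\partial^2}{\partial\varphi^2}\right]. \tag{2.60}$$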


Though this expression may look a bit intimidating, let us notice that in our current problem, the 
boundary conditions depend only on coordinate $\alpha$: 23 

$$\phi\big|_{\alpha=0} = V, \qquad \phi\big|_{\alpha=\infty} = 0. \tag{2.61}$$


Hence there is every reason to believe that the electrostatic potential in all space is a function of $\alpha$ 
alone. (In other words, all ellipsoids $\alpha$ = const are the equipotential surfaces.) Indeed, acting on such a 
function $\phi(\alpha)$ by the Laplace operator (60), we see that the two last terms in the square brackets vanish, 
and the Laplace equation (35) is reduced to a simple ordinary differential equation 

$$\frac{d}{d\alpha}\left(\cosh\alpha\frac{d\phi}{d\alpha}\right) = 0. \tag{2.62}$$


Integrating it twice, just as we did in the previous problems, we get 

$$\phi(\alpha) = c_1\int_0^\alpha\frac{d\alpha'}{\cosh\alpha'} + c_2. \tag{2.63}$$

This integral may be readily taken, for example, using the substitution $\xi \equiv \sinh\alpha$ (with $d\xi = \cosh\alpha\,d\alpha$, 
$\cosh^2\alpha = 1 + \sinh^2\alpha = 1 + \xi^2$): 

$$\phi(\alpha) = c_1\int_0^{\sinh\alpha}\frac{d\xi}{1+\xi^2} + c_2 = c_1\arctan(\sinh\alpha) + c_2. \tag{2.64}$$


23 I have called the disk’s potential V, to distinguish it from the potential $\phi$ at an arbitrary point of space. 

The integration constants $c_{1,2}$ are again simply found from boundary conditions, in this case Eqs. (61), 
and we arrive at the final expression for the electrostatic potential: 

$$\phi(\alpha) = V\left[1 - \frac{2}{\pi}\arctan(\sinh\alpha)\right]. \tag{2.65}$$


This solution satisfies both the Laplace equation and the boundary conditions. Mathematicians tell us 
that the solution of any boundary problem of the type (35) is unique, so we do not need to look any 
further. 
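
For readers who prefer a numerical sanity check, solution (2.65) may be tested directly in cylindrical coordinates {$\rho$, z}. The sketch below is built on assumptions not in the original text: it inverts Eqs. (2.59) for $\sinh^2\alpha$ by solving the quadratic equation that follows from them, and uses a simple central-difference approximation of the axially symmetric Laplace operator; all function names are hypothetical.

```python
import math


def disk_potential(rho: float, z: float, R: float = 1.0, V: float = 1.0) -> float:
    """Potential (2.65) of a thin conducting disk at the point with cylindrical coordinates (rho, z)."""
    # From Eqs. (2.59): rho^2/cosh^2(alpha) + z^2/sinh^2(alpha) = R^2, so that s = sinh^2(alpha)
    # is the non-negative root of R^2 s^2 + (R^2 - rho^2 - z^2) s - z^2 = 0.
    diff = rho * rho + z * z - R * R
    s = (diff + math.sqrt(diff * diff + 4.0 * R * R * z * z)) / (2.0 * R * R)
    return V * (1.0 - (2.0 / math.pi) * math.atan(math.sqrt(max(s, 0.0))))


def axisymmetric_laplacian(f, rho: float, z: float, h: float = 1e-3) -> float:
    """Central-difference estimate of d2f/drho2 + (1/rho) df/drho + d2f/dz2 (requires rho > 0)."""
    d2_rho = (f(rho + h, z) - 2.0 * f(rho, z) + f(rho - h, z)) / h**2
    d1_rho = (f(rho + h, z) - f(rho - h, z)) / (2.0 * h)
    d2_z = (f(rho, z + h) - 2.0 * f(rho, z) + f(rho, z - h)) / h**2
    return d2_rho + d1_rho / rho + d2_z


on_disk = disk_potential(0.5, 0.0)                            # should equal V on the disk
residual = axisymmetric_laplacian(disk_potential, 0.8, 0.5)   # should be close to zero
far_value = disk_potential(0.0, 100.0)                        # far field on the symmetry axis
coulomb_value = 2.0 / (math.pi * 100.0)                       # Q/(4*pi*eps0*r) with Q = 8*eps0*R*V
print(on_disk, residual, far_value / coulomb_value)
```

The far-field comparison uses the total disk charge Q = 8ε₀RV, which follows from integrating the surface charge density of this solution over the disk.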


Now we may use Eq. (2) to find the surface density of electric charge, but in the case of a thin 
disk, it is more natural to add up such densities on its top and bottom surfaces at the same distance $r = 
(x^2 + y^2)^{1/2}$ from the disk center (which are evidently equal, due to the problem symmetry about plane z = 
0): $\sigma = 2\varepsilon_0E_n|_{z=+0}$. According to Eq. (65), the electric field on the surface is 

$$E_n\big|_{z=+0} = -\frac{\partial\phi}{\partial z}\bigg|_{z=+0} = -\frac{\partial\phi(\alpha)}{\partial(R\sinh\alpha\cos\beta)}\bigg|_{\alpha=+0} = \frac{2}{\pi}\frac{V}{R\cos\beta} = \frac{2}{\pi}\frac{V}{(R^2-r^2)^{1/2}}, \tag{2.66}$$

and we see that the charge is distributed along the disk very nonuniformly: 

$$\sigma = \frac{4}{\pi}\varepsilon_0\frac{V}{(R^2-r^2)^{1/2}}, \tag{2.67}$$


with a singularity at the disk edge. Below we will see that such singularities are very typical for sharp 
edges of conductors. 24 Fortunately, in our current case the divergence is integrable, giving a finite disk 
charge: 

$$Q = \int_{\text{disk surface}}\sigma\,d^2r = \int_0^R\sigma(r)\,2\pi r\,dr = \frac{4}{\pi}\varepsilon_0V\,2\pi\int_0^R\frac{r\,dr}{(R^2-r^2)^{1/2}} = 4\varepsilon_0VR\int_0^1\frac{d\xi}{\sqrt{1-\xi}} = 8\varepsilon_0RV. \tag{2.68}$$


Thus, for the disk’s self-capacitance we get a very simple result, 

$$C = 8\varepsilon_0R \equiv \frac{2}{\pi}\,4\pi\varepsilon_0R, \tag{2.69}$$

i.e. a fraction 2/π ≈ 0.64 of that for the conducting sphere of the same radius, but still 
complying with the general estimate (18). 
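
The integrable edge singularity can also be handled numerically. In the sketch below (an illustration, not the author's method), the substitution r = R sin β, which is just the parametrization of the disk surface α = 0 in Eqs. (2.59), removes the singularity of the integrand, so that a plain midpoint rule is sufficient; the function name and the sample values of R and V are assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def disk_charge(R: float, V: float, n: int = 10_000) -> float:
    """Midpoint-rule integration of sigma(r) * 2*pi*r dr over the disk, with sigma from Eq. (2.67)."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        beta = (i + 0.5) * h
        r = R * math.sin(beta)
        sigma = 4.0 * EPS0 * V / (math.pi * math.sqrt(R * R - r * r))
        dr = R * math.cos(beta) * h  # dr = R cos(beta) d(beta) cancels the edge singularity
        total += sigma * 2.0 * math.pi * r * dr
    return total


R, V = 0.05, 100.0  # disk radius (m) and potential (V), sample values
Q_numeric = disk_charge(R, V)
Q_exact = 8.0 * EPS0 * R * V  # Eq. (2.68)
C_disk = Q_numeric / V
C_sphere = 4.0 * math.pi * EPS0 * R
print(f"Q_numeric / Q_exact = {Q_numeric / Q_exact:.8f}")
print(f"C_disk / C_sphere   = {C_disk / C_sphere:.4f}")
```

The second ratio reproduces the factor 2/π between the self-capacitances of the disk and of the sphere of the same radius.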

Can we always find a “good” system of orthogonal coordinates? Unfortunately, the answer is no, 
even for highly symmetric geometries. This is why the practical value of this approach is limited, and 
other methods of boundary problems are clearly needed. Before moving to them, however, let us note 
that in the case of 2D problems (i.e. cylindrical geometries), the orthogonal coordinate method gets help 
from the following conformal mapping approach. 

Let us consider the pair of Cartesian coordinates {x, y} of the cross-section plane as a complex 
variable 𝓏 = x + iy, 25 where i is the imaginary unit ($i^2 = -1$), and let 𝓌(𝓏) = u + iv be an analytic complex 
function of 𝓏. 26 For our current purposes, the most important property of an analytic function is that its 
real and imaginary parts obey the following Cauchy-Riemann relations: 27 

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{2.70}$$


24 If you seriously worry about the formal infinity of charge density at r → R, please remember that this 
mathematical artifact disappears for any nonvanishing disk thickness. 

25 The complex variable 𝓏 should not be confused with the (real) 3rd spatial coordinate z! We are considering 2D 
problems now, with the potential independent of z. 


For example, for the function 

$$𝓌 = 𝓏^2 = (x + iy)^2 = (x^2 - y^2) + 2ixy, \tag{2.71}$$

whose real and imaginary parts are 

$$u \equiv \mathrm{Re}\,𝓌 = x^2 - y^2, \qquad v \equiv \mathrm{Im}\,𝓌 = 2xy, \tag{2.72}$$

we immediately see that $\partial u/\partial x = 2x = \partial v/\partial y$, and $\partial v/\partial x = 2y = -\partial u/\partial y$, in accordance with Eq. (70). 

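Let us differentiate the first of Eqs. (70) over x again, then change the order of differentiation, 
and after that use the latter of those equations: 

$$\frac{\partial^2u}{\partial x^2} \equiv \frac{\partial}{\partial x}\frac{\partial u}{\partial x} = \frac{\partial}{\partial x}\frac{\partial v}{\partial y} = \frac{\partial}{\partial y}\frac{\partial v}{\partial x} = -\frac{\partial}{\partial y}\frac{\partial u}{\partial y} \equiv -\frac{\partial^2u}{\partial y^2}, \tag{2.73}$$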


and similarly for v. This means that the sum of second-order partial derivatives of each of real functions 
u(x, y) and v(x, y) is zero, i.e. that both functions obey the 2D Laplace equation. This mathematical fact 
opens a nice way of solving problems of electrostatics for (relatively simple) 2D geometries. Imagine 
that for a particular boundary problem we have found a function 𝓌(𝓏) for which either u(x, y) or v(x, y) 
is constant on all electrode surfaces. Then all lines of constant u (or v) present equipotential surfaces, i.e. 
the problem of the potential distribution has been essentially solved. 
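
The statement that the real and imaginary parts of an analytic function are harmonic is easy to probe numerically. The sketch below applies a five-point finite-difference Laplacian to the function of Eq. (2.71) and, as a second arbitrarily chosen example, to the complex exponent; the helper names, the test point, and the step h are illustrative assumptions.

```python
import cmath


def laplacian_2d(f, x: float, y: float, h: float = 1e-3) -> float:
    """Five-point finite-difference estimate of d2f/dx2 + d2f/dy2."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / h**2


def real_part(w):
    """u(x, y) = Re w(x + iy)."""
    return lambda x, y: w(complex(x, y)).real


def imag_part(w):
    """v(x, y) = Im w(x + iy)."""
    return lambda x, y: w(complex(x, y)).imag


def square(z: complex) -> complex:
    """The function of Eq. (2.71)."""
    return z * z


residuals = []
for w in (square, cmath.exp):
    for part in (real_part(w), imag_part(w)):
        residuals.append(laplacian_2d(part, 0.7, -0.3))

print(residuals)  # all entries should be close to zero
```

A non-analytic combination, for example the squared modulus x² + y², gives a Laplacian equal to 4 rather than zero, which makes the contrast with the harmonic functions u and v explicit.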

As a simple example, consider a practically important problem: the quadrupole electrostatic 
lens - a system of four cylindrical 28 electrodes with hyperbolic cross-sections, whose boundaries obey the 
following relations: 

$$x^2 - y^2 = \begin{cases} +a^2, & \text{for the left and right electrodes}, \\ -a^2, & \text{for the top and bottom electrodes}, \end{cases} \tag{2.74}$$

voltage-biased as shown in Fig. 7a. Comparing these relations with Eqs. (72), we see that each electrode 
surface corresponds to a constant value of $u = \pm a^2$. Moreover, potentials of both surfaces with $u = +a^2$ 
are equal to +V/2, while those with $u = -a^2$ are equal to -V/2. Hence we may conjecture that the 
electrostatic potential at each point is a function of u alone; moreover, that a simple linear function, 

$$\phi = c_1u + c_2 = c_1(x^2 - y^2) + c_2, \tag{2.75}$$

is a valid (and hence the unique) solution of our boundary problem. Indeed, it does satisfy the Laplace 
equation, while its constants $c_{1,2}$ may be selected in a way to satisfy all the boundary conditions shown 
in Fig. 7a: 

$$\phi = \frac{V}{2}\frac{x^2 - y^2}{a^2}, \tag{2.76}$$

so that the boundary problem has been solved. 


26 The analytic (or “holomorphic”) function may be defined as the one that may be expanded into the complex 
Taylor series, i.e. is infinitely differentiable in the given point. (Almost all “regular” functions, such as $𝓏^n$, $𝓏^{1/n}$, 
exp 𝓏, ln 𝓏, etc., and their combinations are analytic at all 𝓏, maybe besides certain special points.) If the reader 
needs to brush up his or her background on this subject, I can recommend a popular (and very inexpensive :-) 
textbook by M. Spiegel et al., Complex Variables, 2nd ed., McGraw-Hill, 2009. 

27 These relations may be used, in particular, to prove the famous Cauchy integral formula - see, e.g., MA Eq. (15.1). 

28 Let me remind the reader that in mathematics, the term cylindrical describes a surface formed by translation, along 
a straight line, of an arbitrary curve, and hence more general than the usual circular cylinder. (In this terminology, 
for example, a prism is also a particular form of cylinder, formed by translating a polygon.) 



Fig. 2.7. (a) Quadrupole electrostatic lens geometry and (b) its analysis using conformal mapping. 


According to Eq. (76), all equipotential surfaces are hyperbolic cylinders, similar to those of the 
electrode surfaces. What remains is to find the electric field at an arbitrary point inside the system: 


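$$E_x = -\frac{\partial \phi}{\partial x} = -V\frac{x}{a^2}, \qquad E_y = -\frac{\partial \phi}{\partial y} = V\frac{y}{a^2}. \tag{2.77}$$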


These formulas show that if charged particles (e.g., electrons in an electron optics system) are launched 
to fly ballistically through the lens, along axis z, they experience a force pushing them toward the 
symmetry axis and proportional to the particle's deviation from the axis (and thus equivalent in action to an 
optical lens with positive refractive power) in one direction, and a force pushing them out (negative 
refractive power) in the perpendicular direction. One can show that letting charged particles fly through 
several such lenses, with alternating voltage polarities, in series, enables beam focusing. 29 
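
As a quick numerical sanity check of Eqs. (76)-(77), the following Python sketch may be helpful. Its function names and the sample values of V and a are illustrative assumptions rather than part of the analysis above. It evaluates the potential on an electrode surface, estimates the Laplacian by finite differences, and exposes the opposite signs of the two field components.

```python
import math


def quadrupole_potential(x, y, V=1.0, a=1.0):
    """Potential of Eq. (2.76): phi = (V/2) (x^2 - y^2) / a^2."""
    return 0.5 * V * (x * x - y * y) / (a * a)


def quadrupole_field(x, y, V=1.0, a=1.0):
    """Field of Eq. (2.77): (E_x, E_y) = (-V x / a^2, +V y / a^2)."""
    return (-V * x / (a * a), V * y / (a * a))


def laplacian_fd(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of the 2D Laplacian of f(x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / (h * h)


if __name__ == "__main__":
    V, a = 2.0, 0.5                      # assumed sample values
    y0 = 0.3
    x0 = math.sqrt(a * a + y0 * y0)      # a point with x^2 - y^2 = +a^2
    print("phi on the right electrode:", quadrupole_potential(x0, y0, V, a))
    print("Laplacian estimate:",
          laplacian_fd(lambda x, y: quadrupole_potential(x, y, V, a), 0.1, 0.2))
    print("field at (0.1, 0.2):", quadrupole_field(0.1, 0.2, V, a))
```

For a positive charge and V > 0, the x-component of the force points back toward the axis, while the y-component points away from it, in line with the focusing/defocusing picture described above.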

Hence, we have reduced the 2D Laplace boundary problem to that of finding the proper analytic 
function $w(z)$. This task may be also understood as that of finding a conformal map, i.e. a 
correspondence between components of any point pair, {x, y} and {u, v}, residing, respectively, on the 
initial Cartesian plane $z$ and the plane $w$ of the new variables. For example, Eq. (74) maps the real 
electrode configuration onto the plane capacitor with infinite area (Fig. 7b), and the simplicity of Eq. 
(75) is due to the fact that for the latter system the equipotential surfaces are just parallel planes. 

For more complex geometries, the suitable analytic function $w(z)$ may be hard to find. However, 
for conductors with piecewise-linear cross-section boundaries, substantial help may be obtained from the 
following Schwarz-Christoffel integral 


29 See, e.g., the textbook by P. Grivet, Electron Optics, 2nd ed., Pergamon, 1972, or the review collection A. Septier 
(ed.), Focusing Charged Particles, vol. I, Academic Press, 1967, in particular the review by K.-J. Hanszen and R. 
Lauer, pp. 251-307. 

$$w(z) = \text{const} \times \int \frac{dz}{(z - x_1)^{k_1} (z - x_2)^{k_2} \cdots (z - x_{N-1})^{k_{N-1}}}, \tag{2.78}$$


that provides the conformal mapping of the interior of an arbitrary N-sided polygon on plane $w = u + iv$ 
onto the upper half ($y > 0$) of plane $z = x + iy$. Here $x_j$ ($j = 1, 2, \ldots, N - 1$) are the points of axis $y = 0$ (i.e., of 
the boundary of the mapped region on plane $z$) to which the corresponding polygon vertices are mapped, 
while $k_j$ are the exterior angles at the polygon vertices, measured in the units of $\pi$, with $-1 < k_j < +1$ - 
see Fig. 8. 30 Of points $x_j$, two may be selected arbitrarily (because their effects may be compensated by 
the multiplicative constant in Eq. (78), and the constant of integration), while all the others have to be 
adjusted to provide the correct mapping. 



In the general case, the complex integral (78) may be hard to tackle. However, in some important 
cases, in particular those with right angles ($k_j = \pm 1/2$) and/or with some points $w_j$ at infinity, the integrals 
may be readily worked out, giving explicit analytical expressions for the mapping functions $w(z)$. For 
example, let us consider a semi-infinite strip, defined by restrictions $-1 < u < +1$ and $0 < v$, on plane $w$ - 
see the left panel of Fig. 9. 



Fig. 2.9. Semi-infinite 
strip mapped onto the 
upper half-plane. 


30 Integral (78) includes only (N - 1) rather than N poles, because the polygon's shape is completely determined by (N 
- 1) positions $w_j$ of its vertices and (N - 1) angles $\pi k_j$. In particular, since the algebraic sum of all external angles 
of a polygon equals $2\pi$, the last angle parameter $k_j = k_N$ is uniquely determined by the set of the previous ones. 

The strip may be considered as a polygon, with one vertex at the infinitely distant vertical point 
$w_3 = 0 + i\infty$. Let us map it on the upper half of plane $z$, shown on the right panel of Fig. 9, with vertex 
$w_1 = -1 + i0$ mapped onto point $x_1 = -1$, $y_1 = 0$, and vertex $w_2 = +1 + i0$ mapped onto point $x_2 = +1$, $y_2 = 
0$. Since in this case both external angles are equal to $+\pi/2$, and hence $k_1 = k_2 = +1/2$, Eq. (78) is reduced 
to 

$$w(z) = \text{const} \times \int \frac{dz}{(z+1)^{1/2}(z-1)^{1/2}} = \text{const} \times \int \frac{dz}{(z^2-1)^{1/2}} = \text{const}' \times \int \frac{dz}{(1-z^2)^{1/2}}. \tag{2.79}$$


This complex integral may be taken, just as for real $z$, by the substitution $z = \sin\xi$, giving 

$$w(z) = \text{const}' \times \int^{\arcsin z} d\xi = c_1 \arcsin z + c_2. \tag{2.80}$$

Determining constants $c_{1,2}$ from the required mapping, i.e. from the equations $w(-1 + i0) = -1 + i0$ and 
$w(+1 + i0) = +1 + i0$ (see Fig. 9), we finally get 

$$w(z) = \frac{2}{\pi} \arcsin z, \quad \text{i.e. } z = \sin\frac{\pi w}{2}. \tag{2.81a}$$


Using the well-known expression for the sine of a complex argument, 31 we may rewrite this elegant 
result in either of the two following forms for the real and imaginary components of $z$ and $w$: 

$$u = \frac{2}{\pi} \arcsin \frac{2x}{[(x+1)^2 + y^2]^{1/2} + [(x-1)^2 + y^2]^{1/2}}, \qquad v = \frac{2}{\pi}\, \text{arccosh}\, \frac{[(x+1)^2 + y^2]^{1/2} + [(x-1)^2 + y^2]^{1/2}}{2},$$

$$x = \sin\frac{\pi u}{2} \cosh\frac{\pi v}{2}, \qquad y = \cos\frac{\pi u}{2} \sinh\frac{\pi v}{2}. \tag{2.81b}$$


It is amazing how perfectly the last formula manages to keep $y = 0$ at different borders of our $w$-region 
(Fig. 9): at its side borders ($u = \pm 1$, $0 < v < \infty$), this is performed by the first factor, while at 
the bottom border ($-1 < u < +1$, $v = 0$), the equality is ensured by the second factor. 
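
The mapping (81) is easy to probe numerically. The sketch below is only an illustration; the function names and the sample points are my assumptions. It implements the forward map of Eq. (81a) with Python's complex sine, implements the inverse map from the first line of Eq. (81b), and lets one check both the round trip and the vanishing of y on the strip borders.

```python
import cmath
import math


def strip_to_half_plane(w):
    """Forward map of Eq. (2.81a): z = sin(pi w / 2), with w = u + iv."""
    return cmath.sin(0.5 * math.pi * w)


def half_plane_to_strip(x, y):
    """Inverse map, first line of Eq. (2.81b); returns (u, v) for y >= 0."""
    s = math.hypot(x + 1.0, y) + math.hypot(x - 1.0, y)
    # Clamp the arguments to guard against rounding just outside the domains.
    u = (2.0 / math.pi) * math.asin(max(-1.0, min(1.0, 2.0 * x / s)))
    v = (2.0 / math.pi) * math.acosh(max(1.0, 0.5 * s))
    return u, v


if __name__ == "__main__":
    w = complex(0.3, 0.7)                       # a point inside the strip
    z = strip_to_half_plane(w)
    print("z =", z, "-> back to (u, v) =", half_plane_to_strip(z.real, z.imag))
    # Side border u = +1 and bottom border v = 0 should both give y = 0.
    print("y on side border:  ", strip_to_half_plane(complex(1.0, 2.0)).imag)
    print("y on bottom border:", strip_to_half_plane(complex(0.4, 0.0)).imag)
```

Up to rounding errors, the imaginary part of z vanishes on all three borders of the strip, which is exactly the property used in the gap problem that follows.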


This mapping may be used to solve several electrostatics problems with the geometry shown in 
Fig. 9; probably the most surprising of them is the following one. A straight gap of width $2t$ is cut in a 
thin conducting plane, and voltage $V$ is applied between the resulting half-planes - see the bold lines in 
Fig. 10. Selecting a Cartesian coordinate system with axis z along the cut, axis y perpendicular to the 
plane, and the origin in the middle of the cut, we can write the boundary conditions of this Laplace 
problem as 

$$\phi = \begin{cases} +V/2, & \text{at } x > +t,\ y = 0, \\ -V/2, & \text{at } x < -t,\ y = 0. \end{cases} \tag{2.82}$$


(Due to the problem's symmetry, we may expect that in the middle of the gap, i.e. at $-t < x < +t$ and $y = 0$, 
the electric field is parallel to the plane and hence $\partial\phi/\partial y = 0$.) The comparison of Figs. 9 and 10 shows 
that if we normalize our coordinates to $t$, Eq. (81) provides the conformal mapping of our system on 
plane $z$ to the field in a plane capacitor on plane $w$, with voltage $V$ between two planes $u = \pm 1$. Since we 
already know that in that case $\phi = (V/2)u$, we may immediately use the first of Eqs. (81b) to write the 
final solution of the problem (in the dimensional coordinates): 32 

$$\phi = \frac{V}{\pi} \arcsin \frac{2x}{[(x+t)^2 + y^2]^{1/2} + [(x-t)^2 + y^2]^{1/2}}. \tag{2.83}$$

31 See, e.g., MA Eq. (3.5). 



Fig. 2.10. Equipotential surfaces of 
the electric field between two thin 
conducting semi-planes (or rather 
their cross-sections by the 
perpendicular plane z = const). 


Thin lines in Fig. 10 show the corresponding equipotential surfaces; 33 it is evident that the 
electric field concentrates at the gap edges, just as it did at the edge of the thin disk (Fig. 6). Let me 
leave the remaining calculation of the surface charge distribution and the mutual capacitance between 
the half-planes (per unit length) for the reader's exercise. 
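
Before attempting that exercise, it may be useful to evaluate Eq. (83) numerically. The following minimal sketch (the function name and the default values V = t = 1 are assumptions made for illustration) checks the boundary values (82) on the two half-planes and shows that inside the gap, at y = 0, the potential reduces to (V/π) arcsin(x/t).

```python
import math


def gap_potential(x, y, V=1.0, t=1.0):
    """Potential of Eq. (2.83) for a gap of width 2t in a thin conducting plane."""
    s = math.hypot(x + t, y) + math.hypot(x - t, y)
    # Clamp to [-1, 1] in case rounding pushes the ratio slightly outside.
    arg = max(-1.0, min(1.0, 2.0 * x / s))
    return (V / math.pi) * math.asin(arg)


if __name__ == "__main__":
    print("on the right half-plane:", gap_potential(3.0, 0.0))
    print("on the left half-plane: ", gap_potential(-3.0, 0.0))
    print("inside the gap, x = t/2:", gap_potential(0.5, 0.0))
    # The potential is even in y, so d(phi)/dy = 0 in the gap plane.
    print("symmetry in y:", gap_potential(0.3, 1e-3) - gap_potential(0.3, -1e-3))
```

Since the in-gap potential is (V/π) arcsin(x/t), its derivative over x grows without limit as x approaches ±t, which is the field concentration at the gap edges mentioned above.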


2.5. Variable separation 

The general approach of the methods discussed in the last two sections was to satisfy the Laplace 
equation by a function of a single variable that also satisfies the boundary conditions. Unfortunately, in 
many cases this cannot be done (at least, using practicably simple functions). In this case, a very 
powerful method, called variable separation, may work, frequently producing “semi-analytical” results 
in the form of an infinite series of either elementary or well-studied special functions. The main idea of 
the method is to present the solution of the general boundary problem (35) as the sum of partial 
solutions, 

$$\phi = \sum_k c_k \phi_k, \tag{2.84}$$

where each function $\phi_k$ satisfies the Laplace equation, and then select the set of coefficients $c_k$ to satisfy 
the boundary conditions. More specifically, in the variable separation method the partial solutions $\phi_k$ are 
looked for in the form of a product of functions, each depending on just one spatial coordinate. 


32 This result could also be obtained using the so-called elliptical (not ellipsoidal!) coordinates. 

33 Another graphical representation of the electric field distribution, by field lines, is much less convenient. As a 
reminder, the field lines are defined as lines to which the (in our current case, electrostatic) field vectors are 
tangential at each point. By this definition, the field lines are always normal to the equipotential surfaces, so that it 
is always straightforward to sketch them from the equipotential surface pattern - such as shown in Fig. 10. 

(i) Cartesian coordinates. Let us discuss this approach on the classical example of a rectangular 
box with conducting walls (Fig. 11), with the same potential (that I will take for zero) at all the walls, 
but a different potential $V$ fixed at the top lid. Moreover, in order to demonstrate the power of the 
variable separation method, let us carry out all the calculations for a more general case when the top 
lid potential is an arbitrary 2D function $V(x, y)$. 34 



Fig. 2.11. Standard playground for the variable 
separation method discussion: a rectangular box 
with five conducting, grounded walls and a fixed 
potential distribution V(x, y) on the top lid. 


For this geometry, it is natural to use Cartesian coordinates {x, y, z} and hence present each of 
the partial solutions in Eq. (84) as a product 

$$\phi_k = X(x)Y(y)Z(z). \tag{2.85}$$

Plugging it into the Laplace equation expressed in the Cartesian coordinates, 

$$\frac{\partial^2 \phi_k}{\partial x^2} + \frac{\partial^2 \phi_k}{\partial y^2} + \frac{\partial^2 \phi_k}{\partial z^2} = 0, \tag{2.86}$$

and dividing the result by product XYZ, we get 

$$\frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} + \frac{1}{Z}\frac{d^2 Z}{dz^2} = 0. \tag{2.87}$$


Here comes the punch line of the variable separation method: since the first term of this sum may 
depend only on $x$, the second one only on $y$, etc., Eq. (87) may be satisfied everywhere in the volume 
only if each of these terms equals a constant. In a minute we will see that for our current problem (Fig. 
11), these constant $x$- and $y$-terms have to be negative; hence let us denote these variable separation 
constants as $(-\alpha^2)$ and $(-\beta^2)$, respectively. Now Eq. (87) shows that the constant $z$-term has to be 
positive; if we denote it as $\gamma^2$, we get the following relation: 

$$\alpha^2 + \beta^2 = \gamma^2. \tag{2.88}$$

Now the variables are separated in the sense that for functions $X(x)$, $Y(y)$, and $Z(z)$ we have got 
separate ordinary differential equations, 


34 Such distributions may be implemented in practice using so-called mosaic electrodes consisting of many 
electrically-insulated and individually-biased panels. 

$$\frac{d^2 X}{dx^2} + \alpha^2 X = 0, \qquad \frac{d^2 Y}{dy^2} + \beta^2 Y = 0, \qquad \frac{d^2 Z}{dz^2} - \gamma^2 Z = 0, \tag{2.89}$$

which are related only by Eq. (88) for their parameters. Let us start from the equation for function $X(x)$. 
Its general solution is the sum of functions $\sin\alpha x$ and $\cos\alpha x$, multiplied by arbitrary coefficients. Let us 
select these coefficients to satisfy our boundary conditions. First, since $\phi \propto X$ should vanish at the back 
vertical wall of the box (i.e., with the choice of coordinate origin shown in Fig. 11, at $x = 0$ for any $y$ and 
$z$), the coefficient at $\cos\alpha x$ should be zero. The remaining coefficient (at $\sin\alpha x$) may be included into the 
general factor $c_k$ in Eq. (84), so that we may take $X$ in the form 

$$X = \sin\alpha x. \tag{2.90}$$

This solution satisfies the boundary condition at the opposite wall ($x = a$) only if its argument $\alpha a$ is a 
multiple of $\pi$, i.e. if $\alpha$ is equal to any of the following numbers (commonly called eigenvalues): 35 

$$\alpha_n = \frac{\pi}{a} n, \qquad n = 1, 2, \ldots \tag{2.91}$$

(Terms with negative values of $n$ would not be linearly-independent from those with positive $n$, and may 
be dropped from the sum (84). Value $n = 0$ is formally possible, but would give $X = 0$, i.e. $\phi_k = 0$, at any 
$x$, i.e. no contribution to sum (84), so it may be dropped as well.) Now we see that we indeed had to 
take $\alpha$ real (i.e. $\alpha^2$ positive); otherwise, instead of the oscillating function (90) we would have a sum of 
two exponential functions, which cannot equal zero at two independent points of axis $x$. 

Since the equation for function $Y(y)$ is similar to that for $X(x)$, and the boundary conditions on 
the walls perpendicular to axis $y$ ($y = 0$ and $y = b$) are similar to those for $x$-walls, the absolutely similar 
reasoning gives 

$$Y = \sin\beta y, \qquad \beta_m = \frac{\pi}{b} m, \qquad m = 1, 2, \ldots, \tag{2.92}$$

where the choice of integer $m$ is independent of that of integer $n$. Now we see that according to Eq. (88), 
the separation constant $\gamma$ depends on two indices, $n$ and $m$, so that the relation may be rewritten as 

$$\gamma_{nm} = \left[\alpha_n^2 + \beta_m^2\right]^{1/2} = \pi \left[\left(\frac{n}{a}\right)^2 + \left(\frac{m}{b}\right)^2\right]^{1/2}. \tag{2.93}$$

The corresponding solution of the differential equation for $Z$ may be presented as a sum of two 
exponents $\exp\{\pm\gamma_{nm} z\}$, or alternatively as a linear combination of two hyperbolic functions, $\sinh\gamma_{nm} z$ and 
$\cosh\gamma_{nm} z$, with arbitrary coefficients. At our choice of coordinate origin, the latter option is preferable, 
because $\cosh\gamma_{nm} z$ cannot satisfy the zero boundary condition at the bottom lid of the box ($z = 0$). Hence 
we may take $Z$ in the form 

$$Z = \sinh\gamma_{nm} z, \tag{2.94}$$


35 Note that according to Eqs. (91)-(92), as the spatial dimensions $a$ and $b$ of the system are increased, the 
distances between adjacent eigenvalues tend to zero. This fact implies that for spatially-infinite, non-periodic 
systems, the eigenvalue spectra are continuous, so that the sums of the type (84) become integrals. A few 
problems of this type are provided in Sec. 9 for the reader's exercise. 

that automatically satisfies that condition. 

Now it is the right time to combine Eqs. (84) and (85) for our case in a more explicit form, 
replacing index $k$ with the set of two integer indices $n$ and $m$: 

$$\phi(x, y, z) = \sum_{n,m=1}^{\infty} c_{nm} \sin\frac{\pi n x}{a} \sin\frac{\pi m y}{b} \sinh\gamma_{nm} z, \tag{2.95}$$

where $\gamma_{nm}$ is given by Eq. (93). This solution satisfies our boundary conditions on all walls of the box, 
besides the top lid, for arbitrary coefficients $c_{nm}$. The only job left for us is to choose these coefficients 
from the top-lid requirement: 

$$\phi(x, y, c) = V(x, y) = \sum_{n,m=1}^{\infty} c_{nm} \sin\frac{\pi n x}{a} \sin\frac{\pi m y}{b} \sinh\gamma_{nm} c. \tag{2.96}$$


It may seem like bad luck to have just one equation for the infinite set of coefficients $c_{nm}$. However, the 
decisive help comes from the fact that the functions of $x$ and $y$ that participate in Eq. (96) form full, 
orthogonal sets of 1D functions. The last term means that the integrals of the products of the functions 
with different integer indices over the region of interest equal zero. Indeed, direct integration gives 

$$\int_0^a \sin\frac{\pi n x}{a} \sin\frac{\pi n' x}{a}\, dx = \begin{cases} a/2, & \text{for } n = n', \\ 0, & \text{for } n \ne n', \end{cases} \tag{2.97}$$

and similarly for $y$ (with evident replacements $a \to b$, $n \to m$). Hence, the fruitful way to proceed is to 
multiply both sides of Eq. (96) by the product of the basis functions, with arbitrary indices $n'$ and $m'$, 
and integrate the result over $x$ and $y$: 

$$\int_0^a dx \int_0^b dy\, V(x, y) \sin\frac{\pi n' x}{a} \sin\frac{\pi m' y}{b} = \sum_{n,m=1}^{\infty} c_{nm} \sinh\gamma_{nm} c \int_0^a \sin\frac{\pi n x}{a} \sin\frac{\pi n' x}{a}\, dx \int_0^b \sin\frac{\pi m y}{b} \sin\frac{\pi m' y}{b}\, dy. \tag{2.98}$$

Due to Eq. (97), all terms in the right-hand part of the last equation, besides those with $n = n'$ and $m = 
m'$, vanish, and (replacing $n'$ with $n$, and $m'$ with $m$) we finally get 

$$c_{nm} = \frac{4}{ab \sinh\gamma_{nm} c} \int_0^a dx \int_0^b dy\, V(x, y) \sin\frac{\pi n x}{a} \sin\frac{\pi m y}{b}. \tag{2.99}$$


Relations (93), (95) and (99) present the complete solution of the posed boundary problem; we 
can see both good and bad news here. The first bit of bad news is that in the general case we still need to 
work out (formally, an infinite number of) integrals (99). In some cases, it is possible to do this 
analytically. For example, in our initial problem of constant potential on the top lid, $V(x, y) = \text{const} = V_0$, 
both 1D integrations are elementary; for example 

$$\int_0^a \sin\frac{\pi n x}{a}\, dx = \frac{2a}{\pi n} \times \begin{cases} 1, & \text{for } n \text{ odd}, \\ 0, & \text{for } n \text{ even}, \end{cases} \tag{2.100}$$

and similarly for the integral over $y$, so that 

$$c_{nm} = \frac{16 V_0}{\pi^2 n m \sinh\gamma_{nm} c} \times \begin{cases} 1, & \text{if both } n \text{ and } m \text{ are odd}, \\ 0, & \text{otherwise}. \end{cases} \tag{2.101}$$


Variable separation in Cartesian coordinates (example) 

The second bit of bad news is that even on such a happy occasion, we still have to sum up the infinite series 
(95), so that our result may only be called analytical with some reservations, because in most cases we 
need a computer to get the final numbers or plots. 
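
To make the role of the computer concrete, here is a minimal sketch of how the integrals (99) may be evaluated numerically for an arbitrary lid potential and compared with the analytical result (101). The function names, the midpoint rule, the grid size, and the sample box dimensions are all my assumptions for illustration; any other quadrature would do.

```python
import math


def gamma_nm(n, m, a, b):
    """Separation constant of Eq. (2.93)."""
    return math.pi * math.hypot(n / a, m / b)


def coefficient_numeric(V, n, m, a, b, c, steps=200):
    """Midpoint-rule estimate of c_nm from Eq. (2.99) for a lid potential V(x, y)."""
    hx, hy = a / steps, b / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx = math.sin(math.pi * n * x / a)
        for j in range(steps):
            y = (j + 0.5) * hy
            total += V(x, y) * sx * math.sin(math.pi * m * y / b)
    integral = total * hx * hy
    return 4.0 * integral / (a * b * math.sinh(gamma_nm(n, m, a, b) * c))


def coefficient_constant_lid(V0, n, m, a, b, c):
    """Analytical c_nm of Eq. (2.101) for a constant lid potential V0."""
    if n % 2 == 0 or m % 2 == 0:
        return 0.0
    return 16.0 * V0 / (math.pi ** 2 * n * m * math.sinh(gamma_nm(n, m, a, b) * c))


if __name__ == "__main__":
    V0, a, b, c = 1.0, 1.0, 2.0, 0.5          # assumed sample values
    for n, m in [(1, 1), (3, 1), (2, 1)]:
        print(n, m,
              coefficient_numeric(lambda x, y: V0, n, m, a, b, c),
              coefficient_constant_lid(V0, n, m, a, b, c))
```

For the constant lid the two columns should agree to within the quadrature error, while for a non-uniform V(x, y) only the numerical route remains available in general.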

Now the first good news. Computers are very efficient for both operations (95) and (99), i.e. 
summation and integration. (As was discussed in Sec. 1.2, random errors are averaged out at these 
operations.) As an example, Fig. 12 shows the plots of the electrostatic potential in a cubic box ($a = b = 
c$), with an equipotential top lid ($V = V_0 = \text{const}$), obtained by numerical summation of series (95), using 
the analytical expression (101). The remarkable feature of this calculation is the very fast convergence 
of the series; for the middle cross-section of the cubic box ($z/c = 0.5$), already the first term (with $n = m 
= 1$) gives accuracy about 6%, while the sum of four leading terms (with $n, m = 1, 3$) reduces the error to 
just 0.2%. (For a longer box, $c > a, b$, the convergence is even faster - see the discussion below.) Only 
close to the corners between the top lid and the side walls, where the potential changes very rapidly, 
several more terms are necessary to get a reasonable accuracy. 
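
The convergence claim is easy to test at one particular point. The sketch below (function name and parameters are my assumptions) sums series (95) with coefficients (101), writing the ratio of the two sinh functions through decaying exponents to avoid overflow. At the very center of a cube, superposition and symmetry give a simple reference value: six identical problems, each with a different face at V0, add up to a uniform potential V0, so that each of them contributes V0/6 at the center.

```python
import math


def box_potential(x, y, z, a=1.0, b=1.0, c=1.0, V0=1.0, n_max=1):
    """Partial sum of series (2.95) with coefficients (2.101), odd n, m <= n_max."""
    total = 0.0
    for n in range(1, n_max + 1, 2):
        for m in range(1, n_max + 1, 2):
            g = math.pi * math.hypot(n / a, m / b)          # Eq. (2.93)
            # sinh(g z) / sinh(g c), written in an overflow-safe form
            ratio = (math.exp(-g * (c - z)) * (1.0 - math.exp(-2.0 * g * z))
                     / (1.0 - math.exp(-2.0 * g * c)))
            total += (16.0 * V0 / (math.pi ** 2 * n * m)
                      * math.sin(math.pi * n * x / a)
                      * math.sin(math.pi * m * y / b) * ratio)
    return total


if __name__ == "__main__":
    exact = 1.0 / 6.0                       # center of a cube, V0 = 1
    for n_max in (1, 3, 41):
        phi = box_potential(0.5, 0.5, 0.5, n_max=n_max)
        print(n_max, phi, abs(phi / exact - 1.0))
```

At this point the single leading term is already within a few percent of V0/6, and the four terms with n, m = 1, 3 bring the error down to a fraction of a percent, in line with the figures quoted above for the middle cross-section.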




Fig. 2.12. Distribution of the electrostatic potential within a cubic box (a = b = c) with constant voltage $V_0$ on 
the top lid (Fig. 11), calculated numerically from Eqs. (93), (95) and (101); the panel axes are x/a and z/c. The dashed line on the left panel 
shows the contribution of the main term (with n = m = 1) to the full result. 


The second good news is that our “semi-analytical” result allows its ultimate limits to be explored 
analytically. For example, Eq. (93) shows that for a very flat box ($c \ll a, b$), $\gamma_{nm} z \le \gamma_{nm} c \ll 1$ at least 
for the lowest terms of series (95), with $n \ll a/c$ and $m \ll b/c$. In these terms, sinh functions in Eqs. (96) and 
(99) may be well approximated with their arguments, and their ratio by $z/c$. This means that if we limit 
the summation to these terms, Eq. (95) gives a very simple result 

$$\phi(x, y, z) \approx \frac{z}{c} V(x, y), \tag{2.102}$$

which means that each segment of the flat box behaves just as a plane capacitor. Only near the vertical 
walls (or near possible locations where $V(x, y)$ is changed sharply), the higher terms in the series (95) are 
important, producing deviations from Eq. (102). In the opposite limit ($a, b \ll c$), Eq. (93) shows that, in 
contrast, $\gamma_{nm} c \gg 1$ for all $n$ and $m$. Moreover, the ratio $\sinh\gamma_{nm} z / \sinh\gamma_{nm} c$ drops sharply if either $n$ or $m$ 
is increased, if $z$ is not too close to $c$. Hence in this case a very good approximation may be obtained by 
keeping just the leading term, with $n = m = 1$, in Eq. (95), so that the problem of summation disappears. 
(We saw above that this approximation works reasonably well even for a cubic box.) In particular, for 
the constant potential of the upper lid, we can use Eq. (101) and the exponential asymptotic for both sinh 
functions, to get a very simple formula: 

$$\phi = \frac{16 V_0}{\pi^2} \sin\frac{\pi x}{a} \sin\frac{\pi y}{b} \exp\left\{ -\pi \frac{(a^2 + b^2)^{1/2}}{ab} (c - z) \right\}. \tag{2.103}$$

The same variable separation method may be used to solve more general problems as well. For 
example, if all walls of the box shown in Fig. 11 have an arbitrary potential distribution, one can use the 
linear superposition principle to represent the electrostatic potential distribution inside the box as the 
sum of 6 partial solutions of the type of Eq. (95), each with one wall biased by the corresponding 
voltage, and all others grounded ($\phi = 0$). 


To summarize, the results given by the variable separation method are closer to what we could 
call a genuinely analytical solution than to purely numerical solutions - see Sec. 6 below. Now, let us 
explore the issues that arise when this method is applied in other orthogonal coordinate systems. 


(ii) Polar coordinates. If a system of conductors is cylindrical, the potential distribution is 
independent of the coordinate $z$ along the cylinder axis: $\partial\phi/\partial z = 0$, and the Laplace equation becomes 
two-dimensional. If the conductor's cross-section is rectangular, the variable separation method works best 
in Cartesian coordinates {x, y}, and is just a particular case of the 3D solution discussed above. 
However, if the cross-section is circular, much more compact results may be obtained by using polar 
coordinates $\{\rho, \varphi\}$. As we already know from the last section, these 2D coordinates are orthogonal, so 
that the two-dimensional Laplace operator is a simple sum. 36 Requiring, just as we have done above, 
each component of sum (84) to satisfy the Laplace equation, we get 

$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\phi_k}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\phi_k}{\partial\varphi^2} = 0. \tag{2.104}$$


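In a full analogy with Eq. (85), let us present each particular solution as a product: $\phi_k = \mathcal{R}(\rho)\mathcal{F}(\varphi)$. 
Plugging this expression into Eq. (104) and then dividing all its parts by $\mathcal{R}\mathcal{F}/\rho^2$, we get 

$$\frac{\rho}{\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \frac{1}{\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2} = 0. \tag{2.105}$$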


Following the same reasoning as for the Cartesian coordinates, we get two separated ordinary 
differential equations 


36 See, e.g., MA Eq. (10.3) with $\partial/\partial z = 0$. 

$$\rho\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) = \nu^2 \mathcal{R}, \tag{2.106}$$

$$\frac{d^2\mathcal{F}}{d\varphi^2} + \nu^2 \mathcal{F} = 0, \tag{2.107}$$

where $\nu^2$ is the variable separation constant. 

Let us start their analysis from Eq. (106), plugging into it a probe solution $\mathcal{R} = c\rho^{\alpha}$, where $c$ and 
$\alpha$ are some constants. Elementary differentiation shows that if $\alpha \ne 0$, the equation is indeed satisfied for 
any $c$, with just one requirement on constant $\alpha$, namely $\alpha^2 = \nu^2$. This means that the following linear 
superposition 

$$\mathcal{R} = a_\nu \rho^{+\nu} + b_\nu \rho^{-\nu}, \quad \text{for } \nu \ne 0, \tag{2.108}$$

with constant coefficients $a_\nu$ and $b_\nu$, is also a solution to Eq. (106). Moreover, the general theory of 
linear ordinary differential equations tells us that the solution of a second-order equation like Eq. (106) 
may only depend on just two constant factors that scale two linearly-independent functions. Hence, for 
all values $\nu \ne 0$, Eq. (108) presents the general solution of that equation. The case when $\nu = 0$, in which 
functions $\rho^{+\nu}$ and $\rho^{-\nu}$ are just constants and hence are not linearly-independent, is special, but in this 
case the integration of Eq. (106) is straightforward, 37 giving 

$$\mathcal{R} = a_0 + b_0 \ln\rho, \quad \text{for } \nu = 0. \tag{2.109}$$

In order to specify the separation constant, we should use Eq. (107), whose general solution is 

$$\mathcal{F} = \begin{cases} c_\nu \cos\nu\varphi + s_\nu \sin\nu\varphi, & \text{for } \nu \ne 0, \\ c_0 + s_0 \varphi, & \text{for } \nu = 0. \end{cases} \tag{2.110}$$

There are two possible cases here. In many boundary problems solvable in cylindrical coordinates, the 
free space region, in which the Laplace equation is valid, extends continuously around the origin point $\rho 
= 0$. In this region, the potential has to be continuous and uniquely defined, so that $\mathcal{F}$ has to be a $2\pi$-periodic 
function of angle $\varphi$. For that, one needs $\nu(\varphi + 2\pi)$ to be equal to $\nu\varphi + 2\pi n$, with $n$ an integer, 
immediately giving us a discrete spectrum of possible values of the variable separation constant: 

$$\nu = n = 0, \pm 1, \pm 2, \ldots \tag{2.111}$$

In this case both functions $\mathcal{R}$ and $\mathcal{F}$ may be labeled with the integer index $n$. Taking into account that the 
terms with negative values of $n$ may be summed up with those with positive $n$, and that $s_0$ should equal 
zero (otherwise the $2\pi$-periodicity of function $\mathcal{F}$ would be violated), we see that the general solution to 
the 2D Laplace equation may be presented as 

$$\phi(\rho, \varphi) = a_0 + b_0 \ln\rho + \sum_{n=1}^{\infty} \left( a_n \rho^n + \frac{b_n}{\rho^n} \right) (c_n \cos n\varphi + s_n \sin n\varphi). \tag{2.112}$$

Variable separation in polar coordinates 

37 Actually, we have already done it in Sec. 3 - see Eq. (43). 

Let us see how all this machinery works on the classical problem of a round cylindrical 
conductor placed into an electric field that is uniform and perpendicular to the cylinder's axis at large 
distances - see Fig. 13a. 38 First of all, let us explore the effect of the system's symmetries on coefficients in 
Eq. (112). Selecting the coordinate system as shown in Fig. 13a, and taking the cylinder's potential for 
zero, we immediately have $a_0 = 0$. Moreover, due to the mirror symmetry about plane [x, z], the solution 
has to be an even function of angle $\varphi$, and hence all coefficients $s_n$ should also equal zero. Also, at large 
distances ($\rho \gg R$) from the cylinder axis its effect on the electric field should vanish, and the potential 
should approach that of the uniform field $\mathbf{E} = E_0 \mathbf{n}_x$: 

$$\phi \to -E_0 x = -E_0 \rho \cos\varphi, \quad \text{for } \rho \to \infty. \tag{2.113}$$

This is only possible if in Eq. (112), $b_0 = 0$, and also all coefficients $a_n$ with $n \ne 1$ vanish, while product 
$a_1 c_1$ should be equal to $(-E_0)$. Thus the solution is reduced to the following form 

$$\phi(\rho, \varphi) = -E_0 \rho \cos\varphi + \sum_{n=1}^{\infty} \frac{B_n}{\rho^n} \cos n\varphi, \tag{2.114}$$

in which coefficients $B_n = b_n c_n$ should be found from the boundary condition on the cylinder's surface, 
i.e. at $\rho = R$: 

$$\phi(R, \varphi) = 0. \tag{2.115}$$


Fig. 2.13. Conducting cylinder inserted into an initially uniform electric field perpendicular to its 
axis: (a) the problem's geometry, and (b) the equipotential surfaces given by Eq. (117). 


This requirement yields the following equation, 

$$\left( \frac{B_1}{R} - E_0 R \right) \cos\varphi + \sum_{n=2}^{\infty} \frac{B_n}{R^n} \cos n\varphi = 0, \tag{2.116}$$


38 This problem does belong to our current topic of electrostatic fields between conductors, because the uniform 
electric field may be created by a large plane capacitor. 

which should be satisfied for all $\varphi$. But since functions $\cos n\varphi$ are orthogonal, this equality is only 
possible if all $B_n$ for $n \ge 2$ are equal to zero, while $B_1 = E_0 R^2$. Hence our final answer (which is of course 
only valid outside of the cylinder, i.e. for $\rho \ge R$), is 

$$\phi(\rho, \varphi) = -E_0 \left( \rho - \frac{R^2}{\rho} \right) \cos\varphi = -E_0 \left( 1 - \frac{R^2}{x^2 + y^2} \right) x. \tag{2.117}$$


This result (Fig. 13b) shows a smooth transition between the uniform field (113) far from the 
cylinder, to the equipotential surface of the cylinder (with $\phi = 0$). Such smoothening is very typical for 
Laplace equation solutions. Indeed, as we know from Chapter 1, these solutions correspond to the 
lowest potential energy (1.67), and hence the lowest values of potential gradient modulus, possible at the 
given boundary conditions. 

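To complete the problem, let us calculate the distribution of the surface charge density over the 
cylinder's cross-section, using Eq. (3): 

$$\sigma = \varepsilon_0 E_n \big|_{\text{surface}} = -\varepsilon_0 \frac{\partial\phi}{\partial\rho} \bigg|_{\rho = R} = \varepsilon_0 E_0 \cos\varphi\, \frac{\partial}{\partial\rho} \left( \rho - \frac{R^2}{\rho} \right) \bigg|_{\rho = R} = 2\varepsilon_0 E_0 \cos\varphi. \tag{2.118}$$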


This very simple formula shows that at the field direction shown in Fig. 13a ($E_0 > 0$), the surface charge 
is positive on the right side of the cylinder and negative on its left side, thus creating a field directed 
from the right to the left, that compensates the external field inside the conductor, where the net field is 
zero. Note also that the net electric charge of the cylinder is zero, in correspondence with the 
problem symmetry. Another useful by-product of calculation (118) is that the surface electric field 
equals $2E_0 \cos\varphi$, and hence its largest magnitude is twice the field far from the cylinder. Such electric 
field concentration is very typical for all convex conducting surfaces. 
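
These statements may be verified by a few lines of code. The sketch below is only an illustration, with hypothetical function names and with the derivative over ρ estimated by a central finite difference. It evaluates the potential (117), the normal field at the surface, and the net charge per unit length implied by Eq. (118).

```python
import math


def cylinder_potential(rho, phi, E0=1.0, R=1.0):
    """Potential of Eq. (2.117), valid for rho >= R."""
    return -E0 * (rho - R * R / rho) * math.cos(phi)


def radial_field(rho, phi, E0=1.0, R=1.0, h=1e-6):
    """E_rho = -d(potential)/d(rho), estimated by a central difference."""
    return -(cylinder_potential(rho + h, phi, E0, R)
             - cylinder_potential(rho - h, phi, E0, R)) / (2.0 * h)


def surface_charge_density(phi, E0=1.0, eps0=8.8541878128e-12):
    """Surface charge density of Eq. (2.118): 2 eps0 E0 cos(phi)."""
    return 2.0 * eps0 * E0 * math.cos(phi)


def net_charge_per_length(E0=1.0, R=1.0, eps0=8.8541878128e-12, n=360):
    """Riemann sum of sigma * R * d(phi) over the full circle."""
    dphi = 2.0 * math.pi / n
    return sum(surface_charge_density(k * dphi, E0, eps0) for k in range(n)) * R * dphi


if __name__ == "__main__":
    E0, R = 3.0, 0.5                               # assumed sample values
    print("potential on the surface:", cylinder_potential(R, 0.7, E0, R))
    print("surface field / E0 at phi = 0:", radial_field(R, 0.0, E0, R) / E0)
    print("net charge per unit length:", net_charge_per_length(E0, R))
```

The ratio printed in the second line should be close to 2, i.e. the field at the "pole" of the cylinder is doubled in comparison with the applied field, while the net charge vanishes up to rounding errors.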

The last observation gets additional confirmation for the second possible topology, when Eq. 
(110) is used to describe problems with no angular periodicity. A typical example is a cylindrical 
conductor with a cross-section that features a corner limited by straight lines (Fig. 14). Indeed, we 
may argue that at $\rho < R$ (where $R$ is the scale of radial extension of the straight sides of the corner), the 
Laplace equation may be satisfied by a sum of partial solutions $\mathcal{R}(\rho)\mathcal{F}(\varphi)$ if the angular components of 
the products satisfy the boundary conditions on the corner sides. Taking (just for the simplicity of 
notation) the conductor's potential to be zero, and one of the corner's sides as axis $x$ ($\varphi = 0$), these 
boundary conditions are 

$$\mathcal{F}(0) = \mathcal{F}(\beta) = 0, \tag{2.119}$$

where angle $\beta$ may be anywhere between 0 and $2\pi$ (Fig. 14). 

Fig. 2.14. Cylindrical conductor 
cross-sections with (a) a corner 
and (b) a wedge. 

Comparing this condition with Eq. (110), we see that it requires $c_\nu$ to vanish, and $\nu$ to take one 
of the values of the following discrete spectrum: 

$$\nu_m = (\pi/\beta) m, \tag{2.120}$$

with positive integer $m$. Hence the full solution of the Laplace equation takes the form 

$$\phi = \sum_{m=1}^{\infty} a_m \rho^{\pi m/\beta} \sin\frac{\pi m \varphi}{\beta}, \quad \text{for } \rho < R, \tag{2.121}$$

where constants $s_\nu$ have been incorporated into $a_m$. The set of constants $a_m$ cannot be simply determined, 
because it depends on the exact shape of the conductor outside the corner, and the externally applied 
electric field. However, whatever the set is, in the limit $\rho \to 0$, solution (121) is almost 39 always 
dominated by the term with the lowest $\nu$ (corresponding to $m = 1$), 

$$\phi \to a_1 \rho^{\pi/\beta} \sin\frac{\pi\varphi}{\beta}, \tag{2.122}$$

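because the higher terms go to zero faster. This potential distribution corresponds to the surface charge 
density 

$$\sigma = \varepsilon_0 E_n \big|_{\text{surface}} = -\varepsilon_0 \frac{\partial\phi}{\partial(\rho\varphi)} \bigg|_{\rho = \text{const},\ \varphi \to +0} = -\frac{\pi \varepsilon_0 a_1}{\beta} \rho^{(\pi/\beta - 1)}. \tag{2.123}$$

(It is similar on the opposite face of the angle.) 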

Equation (123) shows that if we are dealing with a usual, concave corner ($\beta < \pi$, see Fig. 14a), 
the charge density (and the surface electric field) tends to zero. In the opposite case, at a “convex corner” 
with $\beta > \pi$ (actually, a wedge - see Fig. 14b), both charge and field concentrate, formally diverging at $\rho 
\to 0$. (So, do not sit on a roof's ridge during a thunderstorm; rather hide in a ditch!) We already saw 
qualitatively similar effects at our analyses of the thin round disk and split plane in the past section. 
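
The dependence of this behavior on the opening angle is conveniently summarized by the exponent of ρ in Eq. (123). The short sketch below (the function names are my own, introduced just for this illustration) tabulates it for a few representative angles.

```python
import math


def corner_exponent(beta):
    """Exponent of rho in Eq. (2.123): sigma is proportional to rho**(pi/beta - 1)."""
    if not 0.0 < beta <= 2.0 * math.pi:
        raise ValueError("angle beta must be within (0, 2*pi]")
    return math.pi / beta - 1.0


def relative_surface_field(rho_over_R, beta):
    """Leading-term scaling of the surface field with the distance from the corner."""
    return rho_over_R ** corner_exponent(beta)


if __name__ == "__main__":
    for label, beta in [("right-angle concave corner", math.pi / 2),
                        ("flat plane", math.pi),
                        ("right-angle wedge", 3 * math.pi / 2),
                        ("edge of a thin plane", 2 * math.pi)]:
        print(f"{label:28s} exponent = {corner_exponent(beta):+.3f}, "
              f"field at rho/R = 0.01: {relative_surface_field(0.01, beta):.3g}")
```

A positive exponent means that the field vanishes at the corner, a negative one means that it diverges; in the limiting case β = 2π the exponent equals -1/2, the same inverse-square-root divergence that follows from Eq. (83) at the edges of the gap.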

(iii) Cylindrical coordinates. Now, let us discuss whether it is possible to generalize our 
approach to problems whose geometry is still axially-symmetric, but with a substantial dependence of 
the potential on the axial coordinate ($\partial\phi/\partial z \ne 0$). The classical example of such a problem is shown in 
Fig. 15. Here the side wall and the bottom lid of a round cylinder are kept at fixed potential (say, $\phi = 0$), 
but the potential $V$ fixed at the top lid is different. This problem is qualitatively similar to the rectangular 
box problem solved above (Fig. 11), and we will also try to solve it for the case of arbitrary voltage 
distribution over the top lid: $V = V(\rho, \varphi)$. 



Fig. 2.15. Round cylinder 
with conducting walls. 


39 Exceptions are possible only for highly symmetric configurations when the external field is crafted to make $a_1 
= 0$. In this case the solution is led by the first nonvanishing term of the series (121). 

Following the main idea of the variable separation method, let us require that each partial 
function $\phi_k$ in Eq. (84) satisfies the Laplace equation, now in full cylindrical coordinates $\{\rho, \varphi, z\}$: 40 

$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\phi_k}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\phi_k}{\partial\varphi^2} + \frac{\partial^2\phi_k}{\partial z^2} = 0. \tag{2.124}$$


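Plugging in $\phi_k$ in the form $\mathcal{R}(\rho)\mathcal{F}(\varphi)Z(z)$ into Eq. (124) and dividing both parts by product $\mathcal{R}\mathcal{F}Z$, we get 

$$\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \frac{1}{\rho^2\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2} + \frac{1}{Z}\frac{d^2 Z}{dz^2} = 0. \tag{2.125}$$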


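Since the first two terms of Eq. (125) can only depend on the polar variables $\rho$ and $\varphi$, while the third term,
only on $z$, at least that term should be a constant. Denoting it (just like in the rectangular box problem)
by $\gamma^2$, we get, instead of Eq. (125), a set of two equations:

$$\frac{d^2\mathcal{Z}}{dz^2} = \gamma^2\mathcal{Z}, \tag{2.126}$$

$$\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \gamma^2 + \frac{1}{\rho^2\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2} = 0. \tag{2.127}$$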


Now, multiplying all the terms of Eq. (127) by $\rho^2$, we see that the last term, $(d^2\mathcal{F}/d\varphi^2)/\mathcal{F}$, may depend
only on $\varphi$, and thus should be constant. Calling that constant $(-\nu^2)$ (as in Sec. (ii) above), we separate Eq.
(127) into an angular equation,

$$\frac{d^2\mathcal{F}}{d\varphi^2} + \nu^2\mathcal{F} = 0, \tag{2.128}$$

and a radial equation:

$$\frac{d^2\mathcal{R}}{d\rho^2} + \frac{1}{\rho}\frac{d\mathcal{R}}{d\rho} + \left(\gamma^2 - \frac{\nu^2}{\rho^2}\right)\mathcal{R} = 0. \tag{2.129}$$


We see that the ordinary differential equations for functions $\mathcal{Z}(z)$ and $\mathcal{F}(\varphi)$ (and hence their
solutions) are identical to those discussed earlier in this section. However, Eq. (129) for the radial
function $\mathcal{R}(\rho)$ (called the Bessel equation) is more complex than in the 2D case, and depends on two
independent constant parameters, $\gamma$ and $\nu$. The latter challenge may be readily overcome if we notice
that any change of $\gamma$ may be reduced to re-scaling the radial coordinate $\rho$. Indeed, introducing a
dimensionless variable $\xi \equiv \gamma\rho$, 41 Eq. (129) may be reduced to an equation with just one parameter, $\nu$:

Bessel equation:

$$\frac{d^2\mathcal{R}}{d\xi^2} + \frac{1}{\xi}\frac{d\mathcal{R}}{d\xi} + \left(1 - \frac{\nu^2}{\xi^2}\right)\mathcal{R} = 0. \tag{2.130}$$
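
The re-scaling step may be spelled out explicitly. The following short derivation is only a sketch of the intermediate algebra, using the notation of Eqs. (129)-(130); it assumes $\gamma \neq 0$:

```latex
% With \xi \equiv \gamma\rho, the chain rule gives d/d\rho = \gamma\, d/d\xi, so that
\frac{d^2\mathcal{R}}{d\rho^2} = \gamma^2\frac{d^2\mathcal{R}}{d\xi^2}, \qquad
\frac{1}{\rho}\frac{d\mathcal{R}}{d\rho} = \frac{\gamma^2}{\xi}\frac{d\mathcal{R}}{d\xi}, \qquad
\frac{\nu^2}{\rho^2} = \gamma^2\frac{\nu^2}{\xi^2}.
% Plugging these into the radial equation (2.129) and dividing all terms by \gamma^2:
\frac{d^2\mathcal{R}}{d\xi^2} + \frac{1}{\xi}\frac{d\mathcal{R}}{d\xi}
  + \left(1 - \frac{\nu^2}{\xi^2}\right)\mathcal{R} = 0 .
```

Parameter $\gamma$ has dropped out of the equation, and survives only in the scale of the argument.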


40 See, e.g., MA Eq. (10.3). 

41 Please note that this normalization is specific for each value of the variable separation parameter $\gamma$. Also, note
that the normalization is meaningless for $\gamma = 0$, i.e. for the case $\mathcal{Z}(z) = \text{const}$. However, if we need partial
solutions for this value of $\gamma$, we can use Eqs. (108)-(109).

Moreover, we already know that for angle-periodic problems the spectrum of eigenvalues of Eq. (128) is
discrete: $\nu = n$.

Unfortunately, even in this case, Eq. (130) cannot be satisfied by a single “elementary” function,
and is the canonical form of an equation defining the Bessel function of the first kind, of order $\nu$,
commonly denoted as $J_\nu(\xi)$. Let me review in brief the Bessel function properties most relevant for the
boundary problems of physics - and some other problems discussed in these notes. 42

First of all, the Bessel function of a negative integer order is very simply related to that with the
positive order:

$$J_{-n}(\xi) = (-1)^n J_n(\xi), \tag{2.131}$$

enabling us to limit our discussion to the functions with $n \geq 0$. Figure 16 shows four functions with a
few lowest non-negative $n$.



Fig. 2.16. Several first-kind Bessel
functions $J_n(\xi)$ of integer order.
Dashed lines show the envelope of
asymptotes (135).


As argument $\xi$ is increased, each function is initially close to a power law: $J_0(\xi) \approx 1$, $J_1(\xi) \approx \xi/2$,
$J_2(\xi) \approx \xi^2/8$, etc. This behavior follows from the Taylor series

$$J_n(\xi) = \left(\frac{\xi}{2}\right)^n \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\,(n+k)!}\left(\frac{\xi}{2}\right)^{2k}, \tag{2.132}$$

which is formally valid for any $\xi$, and may even serve as an alternative definition of function $J_n(\xi)$.
However, this series converges fast only at relatively small arguments, $\xi < n$, where its main term is


42 For a more complete discussion of these functions, see the literature listed in MA Sec. 16, for example, Chapter 
6 (written by P. Davis) in the collection compiled and edited by Abramowitz and Stegun. 
$$J_n(\xi) \approx \frac{1}{n!}\left(\frac{\xi}{2}\right)^n. \tag{2.133}$$

At $\xi \approx n + 0.81\,n^{1/3}$, the Bessel function reaches its maximum 43

$$\max_\xi\left[J_n(\xi)\right] \approx \frac{0.675}{n^{1/3}}, \tag{2.134}$$


and then starts to oscillate with a period that gradually approaches $2\pi$, a phase shift that increases by $\pi/2$
with each unit increment of $n$, and an amplitude that decreases as $\xi^{-1/2}$. These features are described by
the following asymptotic formula

$$J_n(\xi) \to \left(\frac{2}{\pi\xi}\right)^{1/2} \cos\left(\xi - \frac{\pi}{4} - \frac{n\pi}{2}\right), \quad \text{for } \xi/n \to \infty, \tag{2.135}$$

that starts to give reasonable results very soon above the function peaks - see Fig. 16. 44
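
The two limits discussed above are easy to compare numerically. The following is a minimal sketch, not a production routine: it sums the Taylor series (132) directly (the helper names `bessel_j` and `bessel_j_asymptotic` are mine, and the fixed number of series terms is an assumption that is adequate only for moderate arguments, say $\xi < 25$), and compares the result with the asymptotic formula (135).

```python
import math

def bessel_j(n, xi, terms=60):
    """J_n(xi) for integer n >= 0, summed from the Taylor series (2.132)."""
    half = xi / 2.0
    term = half**n / math.factorial(n)        # k = 0 term of the series
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (n + k))  # ratio of consecutive terms
        total += term
    return total

def bessel_j_asymptotic(n, xi):
    """Large-argument asymptotic form (2.135)."""
    return math.sqrt(2.0 / (math.pi * xi)) * math.cos(
        xi - math.pi / 4 - n * math.pi / 2)

if __name__ == "__main__":
    for xi in (0.1, 2.0, 10.0, 20.0):
        print(xi, bessel_j(0, xi), bessel_j_asymptotic(0, xi))
```

At small arguments the series result follows the power laws quoted above, while at larger arguments it approaches the asymptote, so the two descriptions complement each other.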

Now we are ready to return to our case study (Fig. 15). Let us select functions $\mathcal{Z}(z)$ to satisfy the
bottom-lid boundary condition $\mathcal{Z}(0) = 0$, i.e. proportional to $\sinh\gamma z$ - cf. Eq. (95). Then

$$\phi = \sum_{n=0}^{\infty}\sum_{\gamma} J_n(\gamma\rho)\left(c_{n\gamma}\cos n\varphi + s_{n\gamma}\sin n\varphi\right)\sinh\gamma z. \tag{2.136}$$

Next, we need to satisfy the zero boundary condition at the cylinder’s side wall ($\rho = R$). This may be
ensured by taking

$$J_n(\gamma R) = 0. \tag{2.137}$$

Since each function $J_n(\xi)$ has an infinite number of positive zeros (see Fig. 16), which may be numbered
by an integer index $m = 1, 2, \ldots$, Eq. (137) may be satisfied with an infinite number of discrete values of
the separation parameter $\gamma$:

$$\gamma_{nm} = \frac{\xi_{nm}}{R}, \tag{2.138}$$

where $\xi_{nm}$ is the $m$-th zero of function $J_n(\xi)$ - see the top numbers in the cells of Table 1. (Very soon we
will see what we need the bottom numbers for.)
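
The zeros $\xi_{nm}$ may be found numerically. The sketch below is one simple way to do it, under my own assumptions: the function is evaluated from the series (132), sign changes are located on a uniform grid (the step 0.1 is much smaller than the spacing of the zeros, which approaches $\pi$), and each bracket is refined by bisection. The helper names and the sample radius `R` are hypothetical.

```python
import math

def bessel_j(n, xi, terms=60):
    """J_n(xi) for integer n >= 0 from the Taylor series (2.132)."""
    half = xi / 2.0
    term = half**n / math.factorial(n)
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (n + k))
        total += term
    return total

def bessel_zeros(n, count, step=0.1):
    """First `count` positive zeros xi_nm of J_n, by grid scan plus bisection."""
    zeros = []
    a = step                       # start above xi = 0 (a trivial zero for n >= 1)
    fa = bessel_j(n, a)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(n, b)
        if fa * fb < 0:            # sign change: a zero lies inside [a, b]
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(n, mid)
                if flo * fmid <= 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

if __name__ == "__main__":
    R = 0.05                       # hypothetical cylinder radius, in meters
    for xi in bessel_zeros(0, 3):
        print("xi_0m =", round(xi, 5), " gamma_0m =", xi / R)   # Eq. (2.138)
```

The printed values of $\xi_{0m}$ may be compared with the top numbers in the first row of Table 1.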

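Hence, Eq. (136) may be presented in a more explicit form:

Variable separation in cylindrical coordinates (example):

$$\phi(\rho, \varphi, z) = \sum_{n=0}^{\infty}\sum_{m=1}^{\infty} J_n\!\left(\xi_{nm}\frac{\rho}{R}\right)\left(c_{nm}\cos n\varphi + s_{nm}\sin n\varphi\right)\sinh\!\left(\xi_{nm}\frac{z}{R}\right). \tag{2.139}$$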


43 These two formulas for the Bessel function peak are strictly valid for $n \gg 1$, but may be used for reasonable
estimates starting already from $n = 1$; for example, $\max_\xi[J_1(\xi)]$ is close to 0.58 and is reached at $\xi \approx 1.84$, so
that the peak height differs from the asymptotic value 0.675 by just about 15%.

44 Eq. (135) and Fig. 16 clearly show the close analogy between the Bessel functions and the usual trigonometric
functions, sine and cosine. In order to emphasize this similarity, and to help the reader develop more gut feeling
for the Bessel functions, let me mention one fact of the elasticity theory: while sine functions describe, in
particular, possible modes of standing waves on a guitar string, functions $J_n(\xi)$ describe, in particular, possible
standing waves on an elastic round membrane, with $J_0(\xi)$ describing their lowest (fundamental) mode.
Here coefficients $c_{nm}$ and $s_{nm}$ have to be selected to satisfy the only remaining boundary condition - that
on the top lid:

$$V(\rho, \varphi) \equiv \phi(\rho, \varphi, L) = \sum_{n=0}^{\infty}\sum_{m=1}^{\infty} J_n\!\left(\xi_{nm}\frac{\rho}{R}\right)\left(c_{nm}\cos n\varphi + s_{nm}\sin n\varphi\right)\sinh\!\left(\xi_{nm}\frac{L}{R}\right). \tag{2.140}$$

To use it, let us multiply both parts of Eq. (140) by $J_{n'}(\xi_{n'm'}\rho/R)\cos n'\varphi$, integrate the result over the lid
area, and use the following property of the Bessel functions:

$$\int_0^1 J_n(\xi_{nm}s)\,J_n(\xi_{nm'}s)\,s\,ds = \frac{1}{2}\left[J_{n+1}(\xi_{nm})\right]^2\delta_{mm'}, \tag{2.141}$$

where $\delta_{mm'}$ is the Kronecker symbol. 45
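
Relation (141) may be checked numerically. The following is a rough sketch under stated assumptions: the zeros are taken, rounded to five decimals, from Table 1; the Bessel functions are summed from the series (132); and the integral is evaluated by Simpson's rule with a step count I have chosen by hand. The function names are mine.

```python
import math

def bessel_j(n, xi, terms=60):
    """J_n(xi) for integer n >= 0 from the Taylor series (2.132)."""
    half = xi / 2.0
    term = half**n / math.factorial(n)
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (n + k))
        total += term
    return total

def radial_overlap(n, xi_a, xi_b, steps=2000):
    """Left-hand part of Eq. (2.141), by Simpson's rule on [0, 1] (steps is even)."""
    h = 1.0 / steps
    f = lambda s: bessel_j(n, xi_a * s) * bessel_j(n, xi_b * s) * s
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    xi_01, xi_02 = 2.40482, 5.52008        # two zeros of J_0, from Table 2.1
    print("m != m':", radial_overlap(0, xi_01, xi_02))
    print("m == m':", radial_overlap(0, xi_01, xi_01),
          "vs", 0.5 * bessel_j(1, xi_01)**2)
```

The off-diagonal integral vanishes only to the accuracy with which the tabulated zeros are rounded.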


Table 2.1. Approximate values of a few first zeros of a few lowest-order Bessel functions $J_n(\xi)$ (the
top number in each cell), and the values of $dJ_n/d\xi$ at those points (the bottom number in the cell).

          m = 1      2          3          4          5          6
n = 0     2.40482    5.52008    8.65372    11.79215   14.93091   18.07106
         -0.51914   +0.34026   -0.27145   +0.23245   -0.20654   +0.18773

n = 1     3.83171    7.01559    10.17347   13.32369   16.47063   19.61586
         -0.40276   +0.30012   -0.24970   +0.21836   -0.19647   +0.18006

n = 2     5.13562    8.41724    11.61984   14.79595   17.95982   21.11700
         -0.33967   +0.27138   -0.23244   +0.20654   -0.18773   +0.17326

n = 3     6.38016    9.76102    13.01520   16.22347   19.40942   22.58273
         -0.29827   +0.24942   -0.21828   +0.19644   -0.18005   +0.16718

n = 4     7.58834    11.06471   14.37254   17.61597   20.82693   24.01902
         -0.26836   +0.23188   -0.20636   +0.18766   -0.17323   +0.16168

n = 5     8.77148    12.33860   15.70017   18.98013   22.21780   25.43034
         -0.24543   +0.21743   -0.19615   +0.17993   -0.16712   +0.15669


Relation (141) expresses a very specific (“2D”) orthogonality of Bessel functions with different 
indices m - do not confuse them with the function’s order n, please! 46 Since it relates two Bessel 
functions with the same index n, it is natural to ask why its right-hand part contains the function with a 
different index (n + 1). Some clue may come from one more very important property of the Bessel 
functions, the so-called recurrence relations : 47 


45 Let me hope the reader knows what it is; if not - see MA Eq. (13.1). 

46 The Bessel functions of the same argument but of different orders are also orthogonal, but in a different way
(the orthogonality holds for orders $n$ and $n'$ of the same parity, with $n + n' > 0$):

$$\int_0^\infty J_n(\xi)\,J_{n'}(\xi)\,\frac{d\xi}{\xi} = \frac{1}{n + n'}\,\delta_{nn'}.$$

47 These relations provide, in particular, a convenient way for fast numerical computation of all $J_n(\xi)$ after $J_0(\xi)$
has been computed. (The latter is usually done with an algorithm using Eq. (132) for smaller $\xi$ and an extension
of Eq. (135) for larger $\xi$.) Note that most mathematical software packages, including all those listed in MA Sec.
16(iv), include ready subroutines for calculation of functions $J_n(\xi)$ and other special functions used in this lecture
series. In this sense, the line separating these “special functions” from “elementary functions” is rather blurry.

$$J_{n-1}(\xi) + J_{n+1}(\xi) = \frac{2n}{\xi}J_n(\xi), \tag{2.142a}$$

$$J_{n-1}(\xi) - J_{n+1}(\xi) = 2\frac{dJ_n(\xi)}{d\xi}, \tag{2.142b}$$

that in particular yield the following relation (convenient for working out some Bessel function integrals):

$$\frac{d}{d\xi}\left[\xi^n J_n(\xi)\right] = \xi^n J_{n-1}(\xi). \tag{2.143}$$

For our current purposes, let us apply the recurrence relations at special points $\xi_{nm}$. At these
points, $J_n$ vanishes, and the system of two equations (142) may be readily solved to get, in particular,

$$J_{n+1}(\xi_{nm}) = -\frac{dJ_n}{d\xi}(\xi_{nm}), \tag{2.144}$$

so that the square bracket in the right-hand part of Eq. (141) is just $(dJ_n/d\xi)^2$ at $\xi = \xi_{nm}$. Thus the values
of the Bessel function derivatives at the zero points (given by the lower numbers in the cells of Table 1)
are as important for boundary problem solutions as the zeros themselves.
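
The recurrence relations, and their consequence (144), can be checked against Table 1 in a few lines. This is only an illustrative sketch: the Bessel functions are again summed from the series (132), negative orders are handled via Eq. (131), and the helper names are my own.

```python
import math

def bessel_j(n, xi, terms=60):
    """J_n(xi) for integer n; negative orders are reduced via Eq. (2.131)."""
    if n < 0:
        return (-1)**(-n) * bessel_j(-n, xi, terms)
    half = xi / 2.0
    term = half**n / math.factorial(n)
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (n + k))
        total += term
    return total

def bessel_j_prime(n, xi):
    """dJ_n/dxi from the recurrence relation (2.142b)."""
    return 0.5 * (bessel_j(n - 1, xi) - bessel_j(n + 1, xi))

if __name__ == "__main__":
    n, xi = 2, 3.7
    # Eq. (2.142a): both sides should coincide.
    print(bessel_j(n - 1, xi) + bessel_j(n + 1, xi), 2 * n / xi * bessel_j(n, xi))
    # Eq. (2.144) at the first zero of J_0 (top number of the first cell of Table 2.1):
    xi_01 = 2.40482
    print(bessel_j(1, xi_01), -bessel_j_prime(0, xi_01))
```

The last line reproduces, up to the sign, the bottom number in the first cell of the table.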

Since the angular functions $\cos n\varphi$ are also orthogonal - both to each other,

$$\int_0^{2\pi}\cos(n\varphi)\cos(n'\varphi)\,d\varphi = \pi\,\delta_{nn'}, \tag{2.145}$$

and to all functions $\sin n\varphi$, the integration over the lid area kills all terms of both series in the right-hand
part of Eq. (140), besides just one term proportional to $c_{n'm'}$, and hence gives an explicit expression for
that coefficient. The counterpart coefficients $s_{n'm'}$ may be found by repeating the same procedure with
the replacement of $\cos n'\varphi$ by $\sin n'\varphi$. This evaluation (left for the reader’s exercise) completes the solution
of our problem for an arbitrary lid potential $V(\rho, \varphi)$.
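
For orientation, here is a sketch of what the procedure described above yields for the cosine coefficients. It is derived only from Eqs. (140), (141) and (145), with the substitution $s = \rho/R$ in Eq. (141), and should be treated as a check for the reader's own calculation rather than as a quoted result:

```latex
% For n \ge 1 (the lid area element is \rho\, d\rho\, d\varphi):
c_{nm} = \frac{2}{\pi R^2 \left[J_{n+1}(\xi_{nm})\right]^2 \sinh(\xi_{nm}L/R)}
         \int_0^{2\pi} d\varphi \int_0^R \rho\, d\rho\;
         V(\rho,\varphi)\, J_n\!\left(\xi_{nm}\frac{\rho}{R}\right)\cos n\varphi .
% For n = 0 the angular integral of \cos^2 n\varphi equals 2\pi rather than \pi,
% so the numerical factor in front is 1 instead of 2:
c_{0m} = \frac{1}{\pi R^2 \left[J_{1}(\xi_{0m})\right]^2 \sinh(\xi_{0m}L/R)}
         \int_0^{2\pi} d\varphi \int_0^R \rho\, d\rho\;
         V(\rho,\varphi)\, J_0\!\left(\xi_{0m}\frac{\rho}{R}\right).
```

The coefficients $s_{nm}$ follow from the first formula with $\cos n\varphi$ replaced by $\sin n\varphi$.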

Still, before leaving the Bessel functions (for a while :-), we need to address two important
issues. First, we have seen that in our cylinder problem (Fig. 15), the set of functions $J_n(\xi_{nm}\rho/R)$ with
different indices $m$ (that characterize the degree of the Bessel function’s stretch along axis $\rho$) plays a role
similar to that of functions $\sin(\pi n x/a)$ in the rectangular box problem shown in Fig. 11. In this context,
what is the analog of functions $\cos(\pi n x/a)$ - which may be important for some boundary problems? In a
more formal language, are there any functions of the same argument $\xi = \xi_{nm}\rho/R$ that would be linearly
independent of the Bessel functions of the first kind, while satisfying the same differential equation
(130)?

The answer is yes. For the definition of such functions, we first need to generalize our prior
formulas for $J_n(\xi)$, and in particular Eq. (132), to the case of arbitrary order $\nu$. The generalization may
be performed in the following way:

$$J_\nu(\xi) = \left(\frac{\xi}{2}\right)^{\nu}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(\nu + k + 1)}\left(\frac{\xi}{2}\right)^{2k}, \tag{2.146}$$
where $\Gamma(s)$ is the so-called gamma function that may be defined, for almost any real $s$, as 48

$$\Gamma(s) \equiv \int_0^\infty \xi^{s-1}e^{-\xi}\,d\xi. \tag{2.147}$$

The simplest, and the most important property of the gamma function is that for integer values of
its argument it gives the factorial of the number smaller by one:

$$\Gamma(n + 1) = n! \equiv 1\cdot 2\cdot\ldots\cdot n, \tag{2.148}$$

so it is essentially a generalization of the notion of factorial to all real numbers.

The Bessel functions defined by Eq. (146) satisfy (after replacements $n \to \nu$ and $n! \to \Gamma(n + 1)$)
virtually all the relations we have discussed above, including the Bessel equation (130), the asymptotic
formula (135), the orthogonality condition (141), and the recurrence relations (142). Moreover, it may
be shown that for $\nu \neq n$, functions $J_\nu(\xi)$ and $J_{-\nu}(\xi)$ are linearly independent and hence their linear
combination may be used to present a general solution of the Bessel equation. Unfortunately, as Eq.
(131) shows, for $\nu = n$ this is not true, and a solution independent of $J_n(\xi)$ has to be formed in a different
way.
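
As an illustration of Eqs. (146)-(148), the sketch below sums the generalized series using the gamma function from the Python standard library. The function name `bessel_j_nu` is mine. As a test case I use the half-integer orders $\nu = \pm 1/2$, for which the Bessel functions are known to reduce to elementary ones, $J_{1/2}(\xi) = (2/\pi\xi)^{1/2}\sin\xi$ and $J_{-1/2}(\xi) = (2/\pi\xi)^{1/2}\cos\xi$ (a standard fact, not derived in the text above); these two are evidently linearly independent.

```python
import math

def bessel_j_nu(nu, xi, terms=60):
    """J_nu(xi) for real order nu and xi > 0, from the series (2.146).

    math.gamma raises ValueError at non-positive integer arguments, so this
    sketch is meant for non-integer nu or for integer nu >= 0.
    """
    half = xi / 2.0
    total = 0.0
    for k in range(terms):
        total += ((-1)**k / (math.factorial(k) * math.gamma(nu + k + 1))
                  * half**(2 * k + nu))
    return total

if __name__ == "__main__":
    print(math.gamma(6), math.factorial(5))          # Eq. (2.148) for n = 5
    xi = 1.0
    amp = math.sqrt(2.0 / (math.pi * xi))
    print(bessel_j_nu(0.5, xi), amp * math.sin(xi))
    print(bessel_j_nu(-0.5, xi), amp * math.cos(xi))
```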

The most common way of overcoming this difficulty is first to define, for all $\nu \neq n$, function

$$Y_\nu(\xi) \equiv \frac{J_\nu(\xi)\cos\nu\pi - J_{-\nu}(\xi)}{\sin\nu\pi}, \tag{2.149}$$

called the Bessel function of the second kind, or more often the Weber function, 49 and then to follow the
limit $\nu \to n$. At this, both the numerator and denominator of the right-hand part of Eq. (149) tend to
zero, but their ratio tends to a finite value called $Y_n(\xi)$. It may be shown that these functions are still
solutions of the Bessel equation and are linearly independent of $J_n(\xi)$, though they are related just as those
functions if the sign of $n$ changes:

$$Y_{-n}(\xi) = (-1)^n Y_n(\xi). \tag{2.150}$$


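Figure 17 shows a few Weber functions of the lowest integer orders. The plots show that their
asymptotic behavior is very much similar to that of the functions $J_n(\xi)$:

$$Y_n(\xi) \to \left(\frac{2}{\pi\xi}\right)^{1/2}\sin\left(\xi - \frac{\pi}{4} - \frac{n\pi}{2}\right), \quad \text{for } \xi \to \infty, \tag{2.151}$$

but with the phase shift necessary to make these Bessel functions orthogonal to those of the first kind -
cf. Eq. (135). However, for small values of argument $\xi$ the Bessel functions of the second kind behave
completely differently from those of the first kind:

$$Y_n(\xi) \to \begin{cases} \dfrac{2}{\pi}\left[\ln\dfrac{\xi}{2} + \gamma\right], & \text{for } n = 0, \\[2ex] -\dfrac{(n-1)!}{\pi}\left(\dfrac{\xi}{2}\right)^{-n}, & \text{for } n \neq 0, \end{cases} \tag{2.152}$$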


48 See, e.g., MA Eq. (6.7a). I used the word “almost” because the gamma function tends to infinity at all non-positive
integer values of its argument ($s = 0, -1, -2, \ldots$).

49 They are also sometimes called the Neumann functions, and denoted as $N_\nu(\xi)$.

where

$$\gamma \equiv \lim_{n\to\infty}\left(1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} - \ln n\right) \approx 0.5772157\ldots \tag{2.153}$$

is the so-called Euler constant. Relations (152) and Fig. 17 show that functions $Y_n(\xi)$ diverge at $\xi \to 0$
and hence cannot describe the behavior of any physical variable, in particular the electrostatic potential.
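
The limiting procedure behind Eq. (149) may be imitated numerically. The sketch below is a crude illustration, not a recommended algorithm: it evaluates Eq. (149) at an order slightly shifted from an integer (the shift `eps` is my arbitrary choice, giving an accuracy of the order of `eps`), with $J_{\pm\nu}$ summed from the series (146), and compares the result with the small-argument asymptotes (152). All names are mine.

```python
import math

EULER_GAMMA = 0.5772156649   # Euler constant, cf. Eq. (2.153)

def bessel_j_nu(nu, xi, terms=60):
    """J_nu(xi) for non-integer real nu and xi > 0, from the series (2.146)."""
    half = xi / 2.0
    return sum((-1)**k / (math.factorial(k) * math.gamma(nu + k + 1))
               * half**(2 * k + nu) for k in range(terms))

def bessel_y(n, xi, eps=1e-6):
    """Approximate Y_n(xi) as Eq. (2.149) taken at nu = n + eps."""
    nu = n + eps
    return ((bessel_j_nu(nu, xi) * math.cos(nu * math.pi) - bessel_j_nu(-nu, xi))
            / math.sin(nu * math.pi))

def bessel_y_small(n, xi):
    """Small-argument asymptotes (2.152), for n >= 0."""
    if n == 0:
        return 2.0 / math.pi * (math.log(xi / 2.0) + EULER_GAMMA)
    return -math.factorial(n - 1) / math.pi * (xi / 2.0)**(-n)

if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, bessel_y(n, 0.01), bessel_y_small(n, 0.01))
```

Both columns grow in magnitude as the argument is reduced, illustrating the divergence of $Y_n(\xi)$ at $\xi \to 0$.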



Fig. 2.17. A few Bessel functions of the 
second kind (a.k.a. the Neumann 
functions, a.k.a. the Weber functions). 


One may wonder: if this is true, when do we need these functions in physics? This does not
happen too often, but still does. Figure 18 shows an example of a boundary problem of electrostatics
that requires both functions $J_n(\xi)$ and $Y_n(\xi)$.




Fig. 2.18. Simple boundary 
problem that cannot be solved 
using just one kind of Bessel 
functions. 


Two round, coaxial conducting cylinders are kept at the same (say, zero) potential, but at least
one of two horizontal lids has a different potential. The problem is almost completely similar to that
discussed above (Fig. 15), but now we need to find the potential distribution in the free space between
the cylinders, $R_1 < \rho < R_2$. If we use the same variable separation as in the simpler counterpart problem,
we need the radial functions $\mathcal{R}(\rho)$ to satisfy two zero boundary conditions: at $\rho = R_1$ and $\rho = R_2$. With
the Bessel functions of just the first kind, $J_n(\gamma\rho)$, it is impossible to do, because the two boundaries would
impose two independent (and generally incompatible) conditions, $J_n(\gamma R_1) = 0$ and $J_n(\gamma R_2) = 0$, for one
“compression parameter” $\gamma$. The existence of the Bessel functions of the second kind immediately saves
the day, because if a solution is presented as a linear combination, 50

$$c_J J_n(\gamma\rho) + c_Y Y_n(\gamma\rho), \tag{2.154}$$

two zero boundary conditions give two equations for $\gamma$ and ratio $c \equiv c_Y/c_J$. (Due to the oscillating
character of both Bessel functions, these conditions would be typically satisfied by an infinite set of
discrete pairs $\{\gamma, c\}$.) Note, however, that generally none of these pairs would correspond to zeros of
either $J_n$ or $Y_n$, so that having an analog of Table 1 for the latter function would not help much. Hence,
even the simple problems of this kind (like the one shown in Fig. 18) typically require numerical
solutions of algebraic (transcendental) equations.
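
To show what such a numerical solution may look like, here is a sketch for $n = 0$. Requiring the combination (154) to vanish at both $\rho = R_1$ and $\rho = R_2$, and eliminating $c$, gives the transcendental equation $J_n(\gamma R_1)Y_n(\gamma R_2) - J_n(\gamma R_2)Y_n(\gamma R_1) = 0$, which is solved below by a grid scan followed by bisection. The assumptions are mine: the radii are sample values in arbitrary units, $Y_n$ is approximated by Eq. (149) at a slightly non-integer order, and all function names are hypothetical.

```python
import math

def bessel_j_nu(nu, xi, terms=60):
    """J_nu(xi) from the series (2.146); nu is an integer >= 0 or a non-integer."""
    half = xi / 2.0
    return sum((-1)**k / (math.factorial(k) * math.gamma(nu + k + 1))
               * half**(2 * k + nu) for k in range(terms))

def bessel_y(n, xi, eps=1e-6):
    """Approximate Y_n(xi) as Eq. (2.149) taken at nu = n + eps."""
    nu = n + eps
    return ((bessel_j_nu(nu, xi) * math.cos(nu * math.pi) - bessel_j_nu(-nu, xi))
            / math.sin(nu * math.pi))

def cross_product(gamma, n, r1, r2):
    """Vanishes when the combination (2.154) is zero at both rho = r1 and rho = r2."""
    return (bessel_j_nu(n, gamma * r1) * bessel_y(n, gamma * r2)
            - bessel_j_nu(n, gamma * r2) * bessel_y(n, gamma * r1))

def first_root(n, r1, r2, start=0.5, step=0.1):
    """Smallest gamma > start solving cross_product = 0 (grid scan + bisection)."""
    a, fa = start, cross_product(start, n, r1, r2)
    while True:
        b = a + step
        fb = cross_product(b, n, r1, r2)
        if fa * fb < 0:
            break
        a, fa = b, fb
    for _ in range(50):
        mid = 0.5 * (a + b)
        fmid = cross_product(mid, n, r1, r2)
        if fa * fmid <= 0:
            b = mid
        else:
            a, fa = mid, fmid
    return 0.5 * (a + b)

if __name__ == "__main__":
    R1, R2 = 1.0, 2.0                      # sample radii, arbitrary units
    gamma = first_root(0, R1, R2)
    c = -bessel_j_nu(0, gamma * R1) / bessel_y(0, gamma * R1)   # ratio c_Y / c_J
    print("gamma =", gamma, " c =", c)
    print("J_0 at the walls:", bessel_j_nu(0, gamma * R1), bessel_j_nu(0, gamma * R2))
```

The last printed line illustrates the remark above: at the found $\gamma$, function $J_0$ does not vanish at either wall.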

One more issue we need to address, before moving on to the spherical coordinates, is the so-called
modified Bessel functions: of the first kind, $I_\nu(\xi)$, and of the second kind, $K_\nu(\xi)$. They are two
linearly-independent solutions of the modified Bessel equation,

Modified Bessel equation:

$$\frac{d^2\mathcal{R}}{d\xi^2} + \frac{1}{\xi}\frac{d\mathcal{R}}{d\xi} - \left(1 + \frac{\nu^2}{\xi^2}\right)\mathcal{R} = 0, \tag{2.155}$$

that differs from Eq. (130) “only” by the sign of one of its terms. Figure 19 shows a simple problem that
leads to this equation: a round conducting cylinder is sliced, perpendicular to its axis, into rings of equal
height $h$, which are kept at equal but sign-alternating potentials.



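[Figure: round cylinder with its axis along $z$, sliced into rings kept at alternating potentials $\phi = +V/2$, $\phi = -V/2$, $\phi = +V/2$.]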


Fig. 2.19. Typical boundary problem whose 
solution may be conveniently described in 
terms of the modified Bessel functions. 


If the gaps between the sections are narrow, $t \ll h$, we may use the variable separation method
for the solution to this problem, but now we evidently need periodic (rather than exponential) solutions


50 A pair of linearly independent functions, used for presentation of the general solution of the Bessel equation, may
also be chosen in a different way, using the so-called Hankel functions

$$H_n^{(1,2)}(\xi) \equiv J_n(\xi) \pm iY_n(\xi).$$

For representing the general solution of Eq. (130), this alternative is completely similar to using the pair of
complex functions $\exp\{\pm i\alpha x\} \equiv \cos\alpha x \pm i\sin\alpha x$ instead of the pair of real functions $\{\cos\alpha x, \sin\alpha x\}$ for
representing the general solution of Eq. (89) for $X(x)$.
along axis $z$, i.e. linear combinations of $\sin kz$ and $\cos kz$ with various real values of constant $k$.
Separating the variables, we arrive at a differential equation similar to Eq. (129), but with the negative
sign before the separation constant:

$$\frac{d^2\mathcal{R}}{d\rho^2} + \frac{1}{\rho}\frac{d\mathcal{R}}{d\rho} - \left(k^2 + \frac{\nu^2}{\rho^2}\right)\mathcal{R} = 0. \tag{2.156}$$

Radial coordinate normalization, $\xi \equiv k\rho$, immediately leads us to Eq. (155), and hence (for $\nu = n$) to the
modified Bessel functions $I_n(\xi)$ and $K_n(\xi)$.

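Figure 20 shows the behavior of a few such functions, of a few lowest orders. One can see that at
$\xi \to 0$ it is virtually similar to that of the “usual” Bessel functions - cf. Eqs. (132) and (152), with $K_n(\xi)$
multiplied (due to purely historical reasons) by an additional coefficient, $\pi/2$:

$$I_n(\xi) \to \frac{1}{n!}\left(\frac{\xi}{2}\right)^n, \qquad K_n(\xi) \to \begin{cases} -\left[\ln\dfrac{\xi}{2} + \gamma\right], & \text{for } n = 0, \\[2ex] \dfrac{(n-1)!}{2}\left(\dfrac{\xi}{2}\right)^{-n}, & \text{for } n \neq 0. \end{cases} \tag{2.157}$$

However, the asymptotic behavior of the modified functions is very much different, with $I_n(\xi)$
exponentially growing and $K_n(\xi)$ exponentially dropping at $\xi \to \infty$:

$$I_n(\xi) \to \left(\frac{1}{2\pi\xi}\right)^{1/2}e^{\xi}, \qquad K_n(\xi) \to \left(\frac{\pi}{2\xi}\right)^{1/2}e^{-\xi}. \tag{2.158}$$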


Fig. 2.20. Modified Bessel 
functions of the first kind (left 
panel) and the second kind 
(right panel). 


To complete our brief survey of the Bessel functions, let me note that all the functions we have
discussed so far may be considered as particular cases of Bessel functions of the complex argument, say
$J_n(z)$ and $Y_n(z)$, or, alternatively, $H_n^{(1,2)}(z) \equiv J_n(z) \pm iY_n(z)$. 51 The “usual” Bessel functions $J_n(\xi)$ and
$Y_n(\xi)$ may be considered as a set of values of these generalized functions on the real axis ($z = \xi$), while
the modified functions as their particular case at $z = i\xi$:

$$I_\nu(\xi) = i^{-\nu}J_\nu(i\xi), \qquad K_\nu(\xi) = \frac{\pi}{2}\,i^{\nu+1}H_\nu^{(1)}(i\xi). \tag{2.159}$$

51 These complex functions still obey the general relations (143) and (146), with $\xi$ replaced with $z$.
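
The first of relations (159) is easy to try out, because the Taylor series (132) works for a complex argument just as well. The sketch below (with my own helper names) computes $I_n(\xi)$ for integer $n \geq 0$ as $i^{-n}J_n(i\xi)$ and compares it with the leading asymptote from Eq. (158); since corrections to that asymptote are of the relative order of $1/\xi$, only a rough agreement should be expected.

```python
import math

def bessel_j(n, z, terms=80):
    """J_n(z) for integer n >= 0 and real or complex z, from the series (2.132)."""
    half = z / 2
    term = half**n / math.factorial(n)
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (n + k))
        total += term
    return total

def bessel_i(n, xi):
    """I_n(xi) from the first of Eqs. (2.159): I_n(xi) = i^(-n) J_n(i xi)."""
    value = (1j)**(-n) * bessel_j(n, 1j * xi)
    return value.real          # the imaginary part is just the rounding noise

def bessel_i_asymptotic(xi):
    """Leading large-argument behavior of I_n(xi), Eq. (2.158)."""
    return math.exp(xi) / math.sqrt(2.0 * math.pi * xi)

if __name__ == "__main__":
    for xi in (1.0, 5.0, 20.0):
        print(xi, bessel_i(0, xi), bessel_i_asymptotic(xi))
```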


Moreover, this generalization of the Bessel functions to the whole complex plane $z$ enables the use of
their values along other directions on that plane, for example at angles $\pi/4 \pm \pi/2$. As a result, one
arrives at the so-called Kelvin functions

$$\text{ber}_\nu\,\xi + i\,\text{bei}_\nu\,\xi \equiv J_\nu\!\left(\xi e^{-i\pi/4}\right), \qquad \text{ker}_\nu\,\xi + i\,\text{kei}_\nu\,\xi \equiv i\frac{\pi}{2}H_\nu^{(1)}\!\left(\xi e^{i3\pi/4}\right), \tag{2.160}$$

which are also useful for some important problems of mathematical physics and engineering.
Unfortunately, we do not have time to discuss these problems in this course. 52

(iv) Spherical coordinates are very important in physics, because of the (approximate) spherical
symmetry of many objects - from electrons, nuclei and atoms to planets and stars. Let us again
require each component $\phi_k$ of Eq. (84) to satisfy the Laplace equation. Using the well known expression
for the Laplace operator in spherical coordinates, 53 we get

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi_k}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi_k}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\phi_k}{\partial\varphi^2} = 0. \tag{2.161}$$


Let us look for a solution of this equation in the following variable-separated form:

$$\phi_k = \frac{\mathcal{R}(r)}{r}\,\mathcal{P}(\cos\theta)\,\mathcal{F}(\varphi). \tag{2.162}$$


Separating the equations one by one, just as this has been done in cylindrical coordinates, we get the
following equations for the functions participating in this solution:

$$\frac{d^2\mathcal{R}}{dr^2} - \frac{l(l+1)}{r^2}\mathcal{R} = 0, \tag{2.163}$$

$$\frac{d}{d\xi}\left[(1 - \xi^2)\frac{d\mathcal{P}}{d\xi}\right] + \left[l(l+1) - \frac{\nu^2}{1 - \xi^2}\right]\mathcal{P} = 0, \tag{2.164}$$

$$\frac{d^2\mathcal{F}}{d\varphi^2} + \nu^2\mathcal{F} = 0, \tag{2.165}$$

where $\xi \equiv \cos\theta$ is a new variable in lieu of $\theta$ (so that $-1 \leq \xi \leq +1$), and $\nu^2$ and $l(l+1)$ are the separation
constants. (The reason for selection of the latter one in this form will be clear in a minute.) One can see
that, in contrast with the cylindrical coordinates, the equation for the radial functions is quite simple.


52 Later in the course we will also run into the so-called spherical Bessel functions $j_n(\xi)$ and $y_n(\xi)$, which may be
expressed via the Bessel functions of a semi-integer order. Surprisingly enough, the spherical Bessel functions
turn out to be much simpler than $J_n(\xi)$ and $Y_n(\xi)$.

53 See, e.g., MA Eq. (10.9). 
Indeed, let us look for its solution in the form $cr^\alpha$ - just as we have done with Eq. (106). Plugging this
solution into Eq. (163), we immediately get the following condition on parameter $\alpha$:

$$\alpha(\alpha - 1) = l(l+1). \tag{2.166}$$

This quadratic equation has two roots, $\alpha = l + 1$ and $\alpha = -l$, so that the general solution to Eq. (163) is

$$\mathcal{R} = a_l r^{l+1} + \frac{b_l}{r^l}. \tag{2.167}$$

Equation (165) is also very simple, and to some extent similar to Eq. (108) for the cylindrical
coordinates. However, Eq. (164) for function $\mathcal{P}(\xi)$, where $\xi$ is the cosine of the polar angle $\theta$, is the so-called
Legendre differential equation, whose solution cannot be expressed via what is usually called
“elementary functions” - though, again, there is no generally accepted line between them and “special
functions”.


Let us start with axially-symmetric problems for which $\partial\phi/\partial\varphi = 0$. This means $\mathcal{F}(\varphi) = \text{const}$, and
thus $\nu = 0$, so that Eq. (164) is reduced to the so-called Legendre’s ordinary differential equation:

Legendre equation:

$$\frac{d}{d\xi}\left[(1 - \xi^2)\frac{d\mathcal{P}}{d\xi}\right] + l(l+1)\mathcal{P} = 0. \tag{2.168}$$

One can readily check that the solutions of this equation for integer values of $l$ are specific (Legendre)
polynomials 54 that may be defined, for example, by the following Rodrigues’ formula:

$$\mathcal{P}_l(\xi) = \frac{1}{2^l\,l!}\frac{d^l}{d\xi^l}\left(\xi^2 - 1\right)^l, \quad l = 0, 1, 2, \ldots \tag{2.169}$$

As follows from this formula, the first few Legendre polynomials are pretty simple:

$$\begin{aligned}
\mathcal{P}_0(\xi) &= 1, \\
\mathcal{P}_1(\xi) &= \xi, \\
\mathcal{P}_2(\xi) &= \tfrac{1}{2}\left(3\xi^2 - 1\right), \\
\mathcal{P}_3(\xi) &= \tfrac{1}{2}\left(5\xi^3 - 3\xi\right), \\
\mathcal{P}_4(\xi) &= \tfrac{1}{8}\left(35\xi^4 - 30\xi^2 + 3\right), \ldots
\end{aligned} \tag{2.170}$$

though such explicit expressions become more and more bulky as $l$ is increased. As Fig. 21 shows, all
these functions, which are defined on the $[-1, +1]$ segment, start at one point, $\mathcal{P}_l(+1) = +1$, and end up
either at the same point or at the opposite point: $\mathcal{P}_l(-1) = (-1)^l$. On the way between these two end points,
the $l$-th polynomial crosses the horizontal axis $l$ times. It is straightforward to use Eq. (169) for proving
that these polynomials form a full, orthogonal set of functions, with the following normalization rule:


$$\int_{-1}^{+1}\mathcal{P}_l(\xi)\,\mathcal{P}_{l'}(\xi)\,d\xi = \frac{2}{2l+1}\,\delta_{ll'}, \tag{2.171}$$

so that any function $f(\xi)$, defined on the segment $[-1, +1]$, may be presented as a unique series over the
polynomials. 55

54 Just for the reader’s reference: if $l$ is not an integer, the general solution of Eq. (2.168) may be represented as a linear
combination of the so-called Legendre functions (not polynomials!) of the first and second kind, $\mathcal{P}_l(\xi)$ and $Q_l(\xi)$.
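
Both the Rodrigues’ formula (169) and the normalization rule (171) can be verified with exact rational arithmetic. In the sketch below (the function names are mine), a polynomial is stored as a list of its coefficients, starting from the zeroth power; $(\xi^2 - 1)^l$ is expanded by the binomial theorem, differentiated $l$ times, and then the overlap integrals are taken term by term.

```python
from fractions import Fraction
from math import comb, factorial

def legendre_coeffs(l):
    """Coefficients of P_l (lowest power first), from Rodrigues' formula (2.169)."""
    # (xi^2 - 1)^l = sum over k of C(l, k) (-1)^(l-k) xi^(2k)
    poly = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):
        poly[2 * k] = Fraction(comb(l, k) * (-1)**(l - k))
    for _ in range(l):                                  # differentiate l times
        poly = [i * c for i, c in enumerate(poly)][1:]
    norm = 2**l * factorial(l)
    return [c / norm for c in poly]

def legendre_overlap(a, b):
    """Exact integral of the product of two polynomials over [-1, +1]."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if (i + j) % 2 == 0:                        # odd powers integrate to zero
                total += ai * bj * Fraction(2, i + j + 1)
    return total

if __name__ == "__main__":
    for l in range(5):
        print(l, [str(c) for c in legendre_coeffs(l)])  # compare with Eq. (2.170)
    p2, p3 = legendre_coeffs(2), legendre_coeffs(3)
    print(legendre_overlap(p2, p3), legendre_overlap(p3, p3))  # cf. Eq. (2.171)
```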



Fig. 2.21. A few lowest Legendre
polynomials $\mathcal{P}_l(\xi)$.


Thus, taking into account the additional division by $r$ in Eq. (162), the general solution of any
axially-symmetric Laplace problem may be presented as

Variable separation in spherical coordinates (for axial symmetry):

$$\phi(r, \theta) = \sum_{l=0}^{\infty}\left(a_l r^l + \frac{b_l}{r^{l+1}}\right)\mathcal{P}_l(\cos\theta). \tag{2.172}$$

Please note a strong similarity between this solution and Eq. (112) for the 2D Laplace problem in polar
coordinates. However, besides the difference in angular functions, there is also a difference (by one) in
the power of the second radial function, and this difference immediately shows up in core problems.


Indeed, let us solve a problem similar to that shown in Fig. 13: find the electric field around a
conducting sphere of radius $R$, placed into an initially uniform external field $E_0$ (whose direction we will
take for axis $z$) - see Fig. 22a. If we select $\phi|_{z=0} = 0$, then $a_0 = b_0 = 0$. Now, just as has been argued for
the cylindrical case, at $r \gg R$ the potential should approach that for the uniform field:

$$\phi \to -E_0 z = -E_0 r\cos\theta, \tag{2.173}$$

and this again means that in Eq. (172), only one of coefficients $a_l$ survives: $a_l = -E_0\delta_{l,1}$. Now, from
the boundary condition on the surface, $\phi(R, \theta) = 0$, we get:

$$0 = \left(-E_0 R + \frac{b_1}{R^2}\right)\cos\theta + \sum_{l \geq 2}\frac{b_l}{R^{l+1}}\mathcal{P}_l(\cos\theta). \tag{2.174}$$

This expression may be viewed as the expansion of function $f(\xi) \equiv 0$ into a series of orthogonal
functions $\mathcal{P}_l(\xi)$. Since such expansions are unique, and Eq. (174) is satisfied if


$$b_l = E_0 R^3\delta_{l,1}, \tag{2.175}$$

this is indeed the only possibility to satisfy the boundary condition, so that, finally,

$$\phi = -E_0\left(r - \frac{R^3}{r^2}\right)\cos\theta. \tag{2.176}$$

55 As a result, there is no practical sense, at least for the purposes of this course, in pursuing (more complex)
solutions to Eq. (168) for non-integer values of $l$.


Fig. 2.22. Conducting sphere in a uniform electric field: (a) the problem’s geometry, and (b) the
equipotential surface pattern given by Eq. (176). The pattern is qualitatively similar to, but quantitatively
different from that for the conducting cylinder in a perpendicular field - cf. Fig. 13.


This distribution, shown in Fig. 22b, is very much similar to Eq. (117) for the cylindrical case,
but with a different power of radius in the second term. This leads to a quantitatively different
distribution of the surface electric field:

$$E_n = -\frac{\partial\phi}{\partial r}\bigg|_{r=R} = 3E_0\cos\theta, \tag{2.177}$$

so that its maximal value is a factor of 3 (rather than 2) larger than the external field.
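
This result is simple enough to be double-checked numerically. The sketch below (with function names and sample values of my choosing) evaluates the potential (176), confirms that it vanishes on the sphere’s surface, and estimates the normal field by a one-sided finite difference just outside the surface, for comparison with Eq. (177).

```python
import math

def phi(r, theta, e0=1.0, radius=1.0):
    """Potential (2.176) outside a conducting sphere in a uniform field E_0."""
    return -e0 * (r - radius**3 / r**2) * math.cos(theta)

def surface_field(theta, e0=1.0, radius=1.0, h=1e-6):
    """E_n = -d(phi)/dr at r = R, by a one-sided finite difference of step h."""
    return -(phi(radius + h, theta, e0, radius) - phi(radius, theta, e0, radius)) / h

if __name__ == "__main__":
    for theta in (0.0, math.pi / 3, math.pi / 2):
        print(theta, phi(1.0, theta), surface_field(theta), 3 * math.cos(theta))
```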

Now let us discuss the Laplace equation solution in the general case (no axial symmetry), but
only for the most important systems in which the free space surrounds the origin from all sides. In this case
the solutions to Eq. (165) have to be $2\pi$-periodic, and hence $\nu = n = 0, \pm 1, \pm 2, \ldots$ Mathematics says that
the Legendre equation (164) with integer $\nu = n$ and a fixed integer $l$ has a solution only for a limited
range of $n$: 56

$$-l \leq n \leq +l. \tag{2.178}$$

These solutions are called the associated Legendre functions. For $n \geq 0$, they may be defined via the
Legendre polynomials using the following formula:

$$\mathcal{P}_l^n(\xi) = (-1)^n\left(1 - \xi^2\right)^{n/2}\frac{d^n}{d\xi^n}\mathcal{P}_l(\xi). \tag{2.179}$$

On the segment $\xi \in [-1, +1]$, each set of the associated Legendre functions with a fixed index $n$ and
non-negative $l$ forms a full, orthogonal set, with the normalization relation,

$$\int_{-1}^{+1}\mathcal{P}_l^n(\xi)\,\mathcal{P}_{l'}^n(\xi)\,d\xi = \frac{2}{2l+1}\frac{(l+n)!}{(l-n)!}\,\delta_{ll'}, \tag{2.180}$$

that is evidently a generalization of Eq. (171).

56 In quantum mechanics, letter $n$ is typically reserved for the “main quantum number”, while the azimuthal
functions are numbered by index $m$. However, I will keep using $n$ as their index, because for this course’s
purposes, this seems more logical in view of the similarity of the spherical and cylindrical functions.

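Since these relations may seem a bit intimidating, let me write down explicit expressions for a
few $\mathcal{P}_l^n(\cos\theta)$ with the lowest values of $l$ and $n \geq 0$, as they follow from Eq. (179):

$$l = 0: \quad \mathcal{P}_0^0(\cos\theta) = 1; \tag{2.181}$$

$$l = 1: \quad \begin{cases} \mathcal{P}_1^0(\cos\theta) = \cos\theta, \\ \mathcal{P}_1^1(\cos\theta) = -\sin\theta; \end{cases} \tag{2.182}$$

$$l = 2: \quad \begin{cases} \mathcal{P}_2^0(\cos\theta) = (3\cos^2\theta - 1)/2, \\ \mathcal{P}_2^1(\cos\theta) = -3\sin\theta\cos\theta, \\ \mathcal{P}_2^2(\cos\theta) = 3\sin^2\theta. \end{cases} \tag{2.183}$$

The reader should agree there is not much intimidation in these functions - which are most important for
applications.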
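
Such explicit expressions are easy to cross-check by evaluating Eq. (179) directly. The sketch below does this with the Legendre polynomial coefficients generated from Rodrigues’ formula (169); the function names are my own. For example, for $l = 2$ it gives $\mathcal{P}_2^1(\cos\theta) = -3\sin\theta\cos\theta$ and $\mathcal{P}_2^2(\cos\theta) = 3\sin^2\theta$.

```python
import math
from fractions import Fraction

def legendre_coeffs(l):
    """Coefficients of P_l (lowest power first), from Rodrigues' formula (2.169)."""
    poly = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):
        poly[2 * k] = Fraction(math.comb(l, k) * (-1)**(l - k))
    for _ in range(l):
        poly = [i * c for i, c in enumerate(poly)][1:]
    norm = 2**l * math.factorial(l)
    return [c / norm for c in poly]

def assoc_legendre(l, n, theta):
    """P_l^n(cos theta) for 0 <= n <= l and 0 <= theta <= pi, from Eq. (2.179)."""
    xi = math.cos(theta)
    coeffs = legendre_coeffs(l)
    for _ in range(n):                                   # n-th derivative of P_l
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    derivative = sum(float(c) * xi**i for i, c in enumerate(coeffs))
    return (-1)**n * (1.0 - xi * xi)**(n / 2) * derivative

if __name__ == "__main__":
    theta = 0.7
    s, c = math.sin(theta), math.cos(theta)
    print(assoc_legendre(1, 1, theta), -s)
    print(assoc_legendre(2, 1, theta), -3 * s * c)
    print(assoc_legendre(2, 2, theta), 3 * s * s)
```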


Now the general solution (162) to the Laplace equation in the spherical coordinates may be
spelled out as

Variable separation in spherical coordinates (general case):

$$\phi(r, \theta, \varphi) = \sum_{l=0}^{\infty}\left(a_l r^l + \frac{b_l}{r^{l+1}}\right)\sum_{n=0}^{l}\mathcal{P}_l^n(\cos\theta)\,\mathcal{F}_n(\varphi), \qquad \mathcal{F}_n(\varphi) = c_n\cos n\varphi + s_n\sin n\varphi. \tag{2.184}$$

Since the difference between angles $\theta$ and $\varphi$ is somewhat artificial, physicists prefer to think not about
functions $\mathcal{P}$ and $\mathcal{F}$ in separation, but directly about their products that participate in this solution. Figure
23 shows a few such angular functions 57 by plotting their modulus along the radius, and using two colors
to show the function’s sign. While the lowest function ($l = 0$, $n = 0$) is just a constant, two “dipole”
functions ($l = 1$) differ from each other by their spatial orientation. Functions with higher $l$ (say, $l = 2$)
differ more substantially, with the following general trend: for each value of $l$, the function with $n = 0$ is
axially-symmetric 58 and has $l$ zeros on its way from $\theta = 0$ to $\theta = \pi$, while the functions with $n = l$ do not
have zeros inside that interval, while oscillating most strongly as functions of $\varphi$.

57 In quantum mechanics, it is more convenient to use a slightly different set of basic functions, namely complex
functions called spherical harmonics,

$$Y_l^n(\theta, \varphi) \equiv \left[\frac{2l+1}{4\pi}\frac{(l-n)!}{(l+n)!}\right]^{1/2}\mathcal{P}_l^n(\cos\theta)\,e^{in\varphi},$$

which are defined for both positive and negative $n$ (within the limits $-l \leq n \leq +l$), because they form a full set of
orthonormal eigenfunctions of angular momentum operators $\hat{L}^2$ and $\hat{L}_z$ - see, e.g., QM Secs. 3.6 and 5.6.


Fig. 2.23. Several products $\mathcal{P}_l^n(\cos\theta)\,\mathcal{F}_n(\varphi)$ with the lowest values of non-negative $l$ and $n$
(rows: $l = 0, 1, 2$; columns: $n = 0, \ldots, l$). Color shows the function’s sign, while the distance from the
origin, its magnitude. (Adapted from Web site http://people.csail.mit.edu/sparis/sh/ .)

As an exception, in order to save time, I will skip an example of application of the associated
Legendre functions, because several such examples are given in the quantum mechanics part of this
series. (Note that in that field, index $n$ is traditionally called $m$ - the magnetic quantum number.)


2.6. Charge images 

So far, we have discussed various methods of solution of the Laplace boundary problem (35).
Let us now move on to the discussion of its generalization, the Poisson equation (1.41), which we need
when, besides the conductors, we also have “free” charges with a known spatial distribution $\rho(\mathbf{r})$. (This
will also allow us, better equipped, to revisit the Laplace problem again in the next section.)

Let us start with a somewhat limited, but sometimes very useful charge image (or “image
charge”) method. Consider a very simple problem: a single point charge near a conducting half-space -
see Fig. 24. Let us prove that its solution, above the conductor’s surface ($z > 0$), may be presented as:

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\left(\frac{q}{r_1} - \frac{q}{r_2}\right) = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{|\mathbf{r} - \mathbf{r}'|} - \frac{1}{|\mathbf{r} - \mathbf{r}''|}\right), \tag{2.185}$$


58 According to Eq. (179), these functions involve only the Legendre polynomials $\mathcal{P}_l \equiv \mathcal{P}_l^0$.




or in a more explicit (coordinate) form:

$$\phi(\mathbf{r}) = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{\left[\rho^2 + (z - d)^2\right]^{1/2}} - \frac{1}{\left[\rho^2 + (z + d)^2\right]^{1/2}}\right), \tag{2.186}$$

where $\rho$ is the distance of the observation point from the vertical line on which the charge is located.
Indeed, this solution evidently satisfies both the boundary condition of zero potential at the surface of
the conductor ($z = 0$), and the Poisson equation (1.41), with the single $\delta$-functional source at point $\mathbf{r}' =
\{0, 0, d\}$ in its right-hand part, because its other singularity, at point $\mathbf{r}'' = \{0, 0, -d\}$, is outside the
region of validity of this solution ($z > 0$).



Fig. 2.24. The simplest problem readily solvable by the charge image method. Point colors are used, here and in the balance of this section, to denote charges of the original (red) and opposite (blue) sign.


Physically, the solution may be interpreted as the sum of the fields of the actual charge ( +q ) at 
point r ’ , and an equal but opposite charge (- q ) at the “mirror image” point r ” (Fig. 24). This is the basic 
idea of the charge image method. Before moving to more complex problems, let us discuss the situation 
shown in Fig. 24 in a little bit more detail. First, we can use Eqs. (3) and (186) to calculate the surface 
charge density: 


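$$\sigma = -\varepsilon_0\frac{\partial\phi}{\partial z}\bigg|_{z=+0} = -\frac{q}{4\pi}\frac{\partial}{\partial z}\left[\frac{1}{\left[\rho^2+(z-d)^2\right]^{1/2}} - \frac{1}{\left[\rho^2+(z+d)^2\right]^{1/2}}\right]_{z=0} = -\frac{q}{4\pi}\frac{2d}{(\rho^2+d^2)^{3/2}}. \tag{2.187}$$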


The total surface charge is 

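$$Q = \int_A\sigma\,d^2r = 2\pi\int_0^\infty\sigma(\rho)\rho\,d\rho = -\frac{q}{2}\int_0^\infty\frac{2d\,\rho\,d\rho}{(\rho^2+d^2)^{3/2}}. \tag{2.188}$$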


This integral may be easily taken using the substitution $\xi \equiv \rho^2/d^2$ (giving $d\xi = 2\rho\,d\rho/d^2$):

$$Q = -\frac{q}{2}\int_0^\infty\frac{d\xi}{(\xi+1)^{3/2}} = -q. \tag{2.189}$$
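The integration in Eqs. (2.188)-(2.189) may also be checked numerically. The following is a minimal Python sketch, not part of the original derivation: the function names, the midpoint rule, and the units q = d = 1 are illustrative assumptions. It integrates the density (2.187) over a disk of finite radius and compares the result with the closed form $-q[1 - d/(\rho_{\max}^2+d^2)^{1/2}]$, which follows from the same substitution and tends to -q for a large radius.

```python
import math

def sigma(rho, q=1.0, d=1.0):
    """Induced surface charge density of Eq. (2.187)."""
    return -q * 2.0 * d / (4.0 * math.pi * (rho**2 + d**2) ** 1.5)

def induced_charge(rho_max, q=1.0, d=1.0, n=200_000):
    """Midpoint-rule estimate of the charge induced within radius rho_max."""
    h = rho_max / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h
        total += 2.0 * math.pi * sigma(rho, q, d) * rho * h
    return total

def exact_charge(rho_max, q=1.0, d=1.0):
    """Closed form of the same integral, cut off at rho_max."""
    return -q * (1.0 - d / math.sqrt(rho_max**2 + d**2))

if __name__ == "__main__":
    for rho_max in (1.0, 10.0, 1000.0):
        print(rho_max, induced_charge(rho_max), exact_charge(rho_max))
```

In this sketch, the disk of radius equal to d carries only about 29% of the total induced charge, illustrating how slowly the density (2.187) decays with distance.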


This result is very natural, because the conductor “wants” to bring as much surface charge from its interior to the surface as necessary to fully compensate the initial charge (+q) and hence to kill the electric field at large distances as efficiently as possible, hence reducing the total electrostatic energy (1.67) to the lowest possible value.

For a deeper understanding of this polarization charge of the surface, let us take our calculations to the extreme - to q equal to one elementary charge e, and place a particle with this charge (for example, a proton) at a macroscopic distance - say 1 m - from the conductor's surface. Then, according to Eq. (189), the total polarization charge of the surface equals that of an electron, and according to Eq. (187), its spatial extent is of the order of $d^2$ = 1 m$^2$. This means that if we consider a much smaller part of the surface, $\Delta A \ll d^2$, its polarization charge magnitude $\Delta Q = \sigma\Delta A$ is much less than one electron! For example, Eq. (187) shows that the polarization charge of quite a macroscopic area $\Delta A$ = 1 cm$^2$ right under the initial charge ($\rho$ = 0) is $e\Delta A/2\pi d^2 \approx 1.6\times10^{-5}e$. Can this be true, or is our theory somehow limited to charges much larger than e?

Surprisingly enough, the answer to this question has become clear (at least to some physicists :-) only as late as in the mid-1980s, when several experiments demonstrated, and theorists accepted (some rather grudgingly), that the usual polarization charge formulas are valid for elementary charges q as well, i.e. that the polarization charge $\Delta Q$ of a macroscopic surface area can indeed be less than e. The underlying reason for this paradox is the nature of the polarization charge of the conductor surface: as should be clear from our discussion in Sec. 1, it is due not to new charged particles brought into the conductor (such charge would be in fact quantized in the units of e), but to a small shift of the free charges of a conductor by a very small distance from their equilibrium positions that they had in the absence of the external field induced by charge q. This shift is not quantized, at least on the scale relevant for our issue, and neither is $\Delta Q$. This understanding has opened a way toward the invention and experimental demonstration of several new devices including so-called single-electron transistors, 59 which may be, in particular, used to measure polarization charges as small as ~$10^{-6}e$.

To complete the discussion of our initial problem (Fig. 24), let us find the potential energy U of 
the charge-to-surface interaction. For that we may use the value of the electrostatic potential (185) in the 
point of the charge itself (r = r ’), of course ignoring the infinite potential created by the charge itself, so 
that the remaining potential is that of the image charge 


$$\phi_{\rm image}(\mathbf{r}') = -\frac{1}{4\pi\varepsilon_0}\frac{q}{2d}. \tag{2.190}$$


Looking at the definition of the electrostatic potential, given by Eq. (1.31), it may be tempting to immediately write $U = q\phi_{\rm image} = -(1/4\pi\varepsilon_0)(q^2/2d)$ [WRONG!], but this would not be correct. The reason is that potential $\phi_{\rm image}$ is not independent of q, but is actually induced by this charge. This is why the correct approach is to use Eq. (1.63), with just one term:


$$U = \frac{1}{2}q\phi_{\rm image} = -\frac{1}{4\pi\varepsilon_0}\frac{q^2}{4d}, \tag{2.191}$$


twice lower in magnitude than the wrong result cited above. In order to double-check this result, and also get a better feeling of the factor ½ that distinguishes it from the wrong guess, we can recalculate


59 Actually, this term (for which the author of these notes should be blamed :-) is misleading: operation of the “single-electron transistor” is based on the interplay of discrete charges (multiples of e) transferred between conductors, and sub-single-electron polarization charges - see, e.g., K. K. Likharev, Proc. IEEE 87, 606 (1999).
energy U as the integral of the force exerted on the charge by the conductor (i.e., in our formalism, by the image charge):


$$U = -\int_\infty^d F(z)\,dz = \frac{1}{4\pi\varepsilon_0}\int_\infty^d\frac{q^2}{(2z)^2}\,dz = -\frac{1}{4\pi\varepsilon_0}\frac{q^2}{4d}. \tag{2.192}$$


This calculation clearly accounts for the gradual build-up of force F, as the real charge is brought from 
afar (where we have opted for U =0) toward the surface. 

This result, used for electrons, particles with charge q = -e, has several important applications. For example, let us plot energy U for an electron near a metallic surface, as a function of d. For that, we may use Eq. (192) until our macroscopic approximation (2) becomes invalid, and U transitions to some negative constant value ($-\psi$) inside the conductor - see Fig. 25a.



Fig. 2.25. (a) Origin of 
the workfunction and (b) 
the field emission of 
electrons (schematically). 


The positive constant $\psi$ is called workfunction, because it describes how much work should be done on an electron to remove it from the conductor. As was discussed in Sec. 1, in good metals the electric field screening happens at interatomic distances $a_0 \sim 10^{-10}$ m. Plugging $d = a_0$ and q = -e into Eq. (191), we get $\psi \approx 6\times10^{-19}$ J $\approx$ 3.5 eV. This crude estimate is in a surprisingly good agreement with the experimental values of the workfunction, ranging between 4 and 5 eV for most metals. 60
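The arithmetic behind this estimate is easy to reproduce. The sketch below is only an illustration: the function names are hypothetical, the constants are the standard SI values of $\varepsilon_0$ and e, and the screening length of exactly $10^{-10}$ m is the same order-of-magnitude assumption as in the text.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C

def image_energy(q, d):
    """Energy (2.191) of charge q at distance d from a conducting plane, in joules."""
    return -q * q / (4.0 * math.pi * EPS0 * 4.0 * d)

def workfunction_estimate_eV(a0=1e-10):
    """Crude workfunction estimate: |U| for an electron at d = a0, in eV."""
    return -image_energy(-E_CHARGE, a0) / E_CHARGE

if __name__ == "__main__":
    print(-image_energy(-E_CHARGE, 1e-10), "J")
    print(workfunction_estimate_eV(), "eV")
```

With these inputs the sketch gives an energy close to $6\times10^{-19}$ J, i.e. a few eV, consistent with the rounded numbers quoted above.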

Next, let us consider the effect of an additional external electric field $E_0$ applied perpendicular to a metallic surface, on this potential profile. Assuming the field to be uniform, we can add its potential energy, $-eE_0d$ at distance d from the surface, to that created by the image charge. (As we know from Eq. (1.53), since field $E_0$ is independent of the electron position, its recalculation to the potential energy does not require the coefficient ½.) As the result, the potential energy of an electron near the surface becomes

$$U(d) = -\frac{1}{4\pi\varepsilon_0}\frac{e^2}{4d} - eE_0d, \quad \text{for } d > a_0, \tag{2.193}$$

with a similar crossover to $U = -\psi$ inside the conductor - see Fig. 25b. One can see that at the appropriate sign, and sufficient magnitude of the applied field, it lowers the potential barrier that prevented electrons from leaving the conductor. At $E_0 \sim \psi/ea_0$ this suppression becomes so strong that electrons just below the Fermi surface start quantum-mechanical tunneling through the remaining thin


60 For more discussion of workfunction, and its effect on electron kinetics, see, e.g., SM Sec. 6.4. 
barrier. This is the field emission effect, which is used in vacuum electronics to provide efficient cathodes that do not require heating to high temperatures. 61

Returning to the basic electrostatics, let us consider some other geometries where the method of images may be effectively applied. First, let us consider a right corner (Fig. 26a). Reflecting the initial charge in the vertical plane we get the image charge shown in the top left corner of the panel, that makes the boundary condition $\phi$ = const satisfied on the vertical surface of the corner. However, in order for the same to be true on the horizontal surface, we have to reflect both the initial charge and the image charge in the horizontal plane, flipping their signs. The final configuration of 4 charges, shown in Fig. 26a, satisfies all the boundary conditions. The resulting potential distribution may be readily written as the evident generalization of Eq. (185). From there, the electric field and electric charge distributions, and the potential energy and forces acting on the charge may be calculated exactly as above.



Fig. 2.26. Charge images for (a, b) internal corners with angles $\pi/2$ and $\pi/4$, (c) plane capacitor, and (d) rectangular box, and (e) equipotential surfaces for the last system.


61 The practical development of such “cold” cathodes is strongly affected by the fact that any nanoscale surface 
irregularity (a protrusion, an atomic cluster, or even a single “adatom” stuck to the surface) may cause a strong 
increase of the local field well above the average applied field E 0 (see, for example our discussion in Sec. 4 
above), making the emission reproducibility an issue. 
Next, consider a corner with angle $\pi/4$ (Fig. 26b). Here we need to repeat the reflection operation not 2 but 4 times before we arrive at the final pattern of 8 positive and negative charges. (Any attempt to continue this process would lead to an overlap with the already existing charges.) This reasoning can be readily extended to any 2D corner with angle $\beta = \pi/n$, with any integer n, that requires 2n charges (including the initial one) to satisfy all the boundary conditions.
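The reflection procedure for a corner with angle $\pi/n$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the walls are taken at polar angles 0 and $\pi/n$, all charges lie in one plane normal to the corner's edge, the potential is measured in units of $q/4\pi\varepsilon_0$, and the names `wedge_images` and `potential` are hypothetical.

```python
import math

def wedge_images(r0, alpha, n):
    """Return (sign, x, y) for the 2n charges of a corner with angle pi/n.

    The real charge (+1) sits at polar coordinates (r0, alpha), with
    0 < alpha < pi/n; it is the first element of the returned list.
    """
    charges = []
    for k in range(n):
        turn = 2.0 * math.pi * k / n
        for sign, angle in ((+1, alpha + turn), (-1, -alpha + turn)):
            charges.append((sign, r0 * math.cos(angle), r0 * math.sin(angle)))
    return charges

def potential(x, y, charges):
    """Potential, in units of q/(4 pi eps0), at point (x, y) of the charges' plane."""
    return sum(s / math.hypot(x - cx, y - cy) for s, cx, cy in charges)

if __name__ == "__main__":
    n = 4                                   # corner with angle pi/4, as in Fig. 26b
    charges = wedge_images(1.0, 0.3, n)
    print(len(charges))                     # 2n charges in total
    for s in (0.5, 1.0, 2.0):
        wall_1 = potential(s, 0.0, charges)
        wall_2 = potential(s * math.cos(math.pi / n), s * math.sin(math.pi / n), charges)
        print(s, wall_1, wall_2)            # both vanish up to rounding errors
```

Each positive charge has a negative mirror twin with respect to either wall, which is why the potential cancels on both of them.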

Some configurations require an infinite number of images that are, however, tractable. The most 
important of them is a system of two parallel conducting surfaces, i.e. a plane capacitor of infinite area 
(Fig. 26c). Here the repeated reflection leads to an infinite system of charges ±q at points 

$$x_j^{\pm} = \pm d + 2Dj, \tag{2.194}$$


where 0 < d < D is the position of the initial charge and j an arbitrary integer. However, the resulting 
infinite series for the potential of the real charge q, created by the field of its images, 


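$$\phi(d) = \frac{1}{4\pi\varepsilon_0}\left[-\frac{q}{2d} + \sum_{j\neq 0}\sum_{\pm}\frac{\pm q}{\left|d - x_j^{\pm}\right|}\right] = -\frac{q}{4\pi\varepsilon_0}\left[\frac{1}{2d} + \frac{d^2}{D^3}\sum_{j=1}^{\infty}\frac{1}{j\left[j^2-(d/D)^2\right]}\right], \tag{2.195}$$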


is converging (in its last form) very fast. For example, the exact value, $\phi(D/2) = -2\ln 2\,(q/4\pi\varepsilon_0D)$, differs by less than 5% from the approximation using just the first term of the sum.
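The convergence claim may be verified directly. The following minimal sketch (with a hypothetical function name, and the potential measured in units of $q/4\pi\varepsilon_0D$) evaluates partial sums of the last form of Eq. (2.195) and compares them with $-2\ln 2$ at d = D/2.

```python
import math

def image_potential(d_over_D, terms):
    """Last form of Eq. (2.195), in units of q/(4 pi eps0 D), with a truncated sum."""
    u = d_over_D
    s = sum(1.0 / (j * (j * j - u * u)) for j in range(1, terms + 1))
    return -(1.0 / (2.0 * u) + u * u * s)

if __name__ == "__main__":
    exact = -2.0 * math.log(2.0)
    for terms in (1, 2, 10, 100_000):
        value = image_potential(0.5, terms)
        print(terms, value, abs(value - exact) / abs(exact))
```

Since the terms of the sum decrease as $1/j^3$, the one-term approximation is already within about 4% of the exact value in this example.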

The same method may be applied to 2D (cylindrical) and 3D rectangular boxes that require, respectively, a 2D or 3D infinite lattice of images; for example in a 3D box with sides a, b, and c, charges ±q are located at points (Fig. 26d)

$$\mathbf{r}_{jkl}^{\pm} = \pm\mathbf{r}' + 2j\mathbf{a} + 2k\mathbf{b} + 2l\mathbf{c}, \tag{2.196}$$

where $\mathbf{r}'$ is the location of the initial (real) charge, and j, k, and l are arbitrary integers. Figure 26e shows the results of summation of the potentials of such charge set, including the real one, in a 2D box (within the plane of the real charge). One can see that the equipotential surfaces, concentric near the charge, are naturally leaning along the conducting walls of the box, which should be equipotential.

Even more surprisingly, the image charge method works very efficiently not only for the rectilinear geometries, but also for spherical ones. Indeed, let us consider a point charge q at some distance d from the center of a conducting, grounded sphere of radius R (Fig. 27a), and try to satisfy the boundary condition $\phi$ = 0 for the electrostatic potential on the sphere's surface using an imaginary charge q' located at some point beyond the surface, i.e. inside the sphere.



Fig. 2.27. Method of charge images for a conducting sphere: (a) the idea, and (b) the resulting potential distribution for the particular case d = 2R.
From the problem's symmetry, it is clear that the point should be at the line passing through the real charge and the sphere's center, at some distance d' from the center. Then the total potential created by the two charges at an arbitrary point with r > R (Fig. 27a) is


$$\phi(r,\theta) = \frac{1}{4\pi\varepsilon_0}\left[\frac{q}{(r^2+d^2-2rd\cos\theta)^{1/2}} + \frac{q'}{(r^2+d'^2-2rd'\cos\theta)^{1/2}}\right]. \tag{2.197}$$


It is easy to see that we can make the two fractions to be equal and opposite at all points on the sphere's surface (i.e. for any $\theta$ at r = R), if we take 62

$$d' = \frac{R^2}{d}, \qquad q' = -\frac{R}{d}q. \tag{2.198}$$


Since the solution to any Poisson boundary problem is unique, Eqs. (197) and (198) give us such 
solution for this problem. Figure 27b shows a typical equipotential pattern calculated using Eqs. (197) 
and (198). It is surprising how formulas that simple may describe such a nontrivial field distribution. 
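A quick numerical check of Eqs. (2.197)-(2.198) may look as follows. It is a sketch only: the function name is hypothetical, the potential is measured in units of $1/4\pi\varepsilon_0$, and the default values q = 1, d = 2, R = 1 merely mimic the case d = 2R.

```python
import math

def sphere_potential(r, theta, q=1.0, d=2.0, R=1.0):
    """Eq. (2.197) with the image parameters (2.198), in units of 1/(4 pi eps0)."""
    d_img = R * R / d            # distance of the image from the sphere's center
    q_img = -q * R / d           # image charge
    real = q / math.sqrt(r * r + d * d - 2.0 * r * d * math.cos(theta))
    image = q_img / math.sqrt(r * r + d_img * d_img - 2.0 * r * d_img * math.cos(theta))
    return real + image

if __name__ == "__main__":
    for k in range(7):
        theta = k * math.pi / 6.0
        # On the sphere (r = R) the two terms cancel; outside it they do not.
        print(theta, sphere_potential(1.0, theta), sphere_potential(1.5, theta))
```

The cancellation at r = R holds for every polar angle, which is exactly the boundary condition the image charge was chosen to satisfy.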

Now let us calculate the total charge Q induced by charge q on the conducting sphere's surface. We could do this, as we have done for the conducting plane, by the brute force integration of the surface charge density $\sigma = -\varepsilon_0\,\partial\phi/\partial n|_{r=R}$. It is more elegant, however, to use the following Gauss law argument. Expression (197) is valid (at r > R) regardless of whether we are dealing with our real problem (charge q and the conducting sphere) or with the equivalent charge configuration (point charges q and q', with no sphere at all). Hence, according to Eq. (1.16), the Gaussian integral over a surface with radius r = R + 0, and hence the total charge inside the sphere, should also be the same. Hence we immediately get

$$Q = q' = -\frac{R}{d}q. \tag{2.199}$$


Similar argumentation may be used to find the charge-to-sphere interaction force:

$$F = qE_{\rm image}(d) = q\frac{q'}{4\pi\varepsilon_0(d-d')^2} = -\frac{q^2}{4\pi\varepsilon_0}\frac{R}{d\,(d-R^2/d)^2} = -\frac{q^2}{4\pi\varepsilon_0}\frac{Rd}{(d^2-R^2)^2}. \tag{2.200}$$


(Note that this expression is legitimate only at d > R.) At large distances, d/R >> 1, this attractive force decreases as $1/d^3$. This unusual dependence arises because, as Eq. (198) specifies, the induced charge of the sphere, responsible for the force, is not constant but decreases as $Q \propto 1/d$.

All the previous formulas referred to a sphere that is grounded to keep its potential equal to zero. But what if we keep the sphere galvanically insulated, so that its net charge is fixed, e.g., equals zero? Instead of solving the problem from scratch, let us use (again!) the linear superposition principle. For that, we may add to the previous problem an additional charge, equal to (-Q), to the sphere, and argue that this addition gives an additional potential that does not depend on the potential induced by charge q. For the interaction force, such addition yields


62 In geometry, such points, with $dd' = R^2$, are referred to as the result of mutual inversion in a sphere of radius R.
$$F = \frac{qQ}{4\pi\varepsilon_0(d-d')^2} + \frac{q(-Q)}{4\pi\varepsilon_0d^2} = -\frac{q^2}{4\pi\varepsilon_0}\left[\frac{Rd}{(d^2-R^2)^2} - \frac{R}{d^3}\right]. \tag{2.201}$$


At large distances, the two terms proportional to $1/d^3$ cancel each other, giving $F \propto 1/d^5$. Such a rapid force decay is due to the fact that the field of the uncharged sphere is equivalent to that of two (equal and opposite) induced charges +Q and -Q, and the distance between them ($d' = R^2/d$) tends to zero at $d \to \infty$. The potential energy of such interaction behaves as $U \propto 1/d^4$ at $d \to \infty$; in the next chapter we will see that this is the general law of the induced dipole interaction.
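The two asymptotic laws are easy to see numerically. In the sketch below (hypothetical function names; forces are measured in units of $q^2/4\pi\varepsilon_0$, and R = 1 by default), the grounded-sphere force (2.200) is compared with $-R/d^3$, and the insulated-sphere force (2.201) with the leading term $-2R^3/d^5$ of its large-distance expansion.

```python
def force_grounded(d, R=1.0):
    """Eq. (2.200) in units of q^2/(4 pi eps0); negative values mean attraction."""
    return -R * d / (d * d - R * R) ** 2

def force_insulated(d, R=1.0):
    """Eq. (2.201): the grounded-sphere force plus that of charge (-Q) at the center."""
    return force_grounded(d, R) + R / d**3

if __name__ == "__main__":
    for d in (2.0, 10.0, 100.0):
        ratio_grounded = force_grounded(d) / (-1.0 / d**3)
        ratio_insulated = force_insulated(d) / (-2.0 / d**5)
        print(d, ratio_grounded, ratio_insulated)   # both ratios tend to 1
```

Both ratios approach unity as d grows, confirming the $1/d^3$ and $1/d^5$ laws stated in the text.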


2.7. Green’s functions 

I have spent so much time/space discussing the potential distributions created by a single point charge in various conductor geometries, because, for any geometry, the generalization of these results to the arbitrary distribution $\rho(\mathbf{r})$ of free charges is straightforward. Namely, if a single charge q, located at point $\mathbf{r}'$, creates electrostatic potential

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}qG(\mathbf{r},\mathbf{r}'), \tag{2.202}$$

then, due to the linear superposition principle, an arbitrary charge distribution creates potential

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int\rho(\mathbf{r}')G(\mathbf{r},\mathbf{r}')\,d^3r'. \tag{2.203}$$


Kernel $G(\mathbf{r},\mathbf{r}')$ is called the (spatial) Green's function - the notion very popular in all fields of physics. 63 Evidently, as Eq. (1.35) shows, in the unlimited free space

$$G(\mathbf{r},\mathbf{r}') = \frac{1}{|\mathbf{r}-\mathbf{r}'|}, \tag{2.204}$$

i.e. the Green's function depends only on one scalar argument - the distance between the field observation point $\mathbf{r}$ and the field-source (charge) point $\mathbf{r}'$. However, as soon as there are conductors around, the situation changes. In this course we will only deal with Green's functions that are defined in the space between conductors, and that vanish as soon as the radius-vector $\mathbf{r}$ points to the surface S of any conductor: 64

$$G(\mathbf{r},\mathbf{r}')\big|_{\mathbf{r}\in S} = 0. \tag{2.205}$$

With this definition, it is straightforward to deduce the Green's functions for the solutions of the last section's problems in which conductors were grounded ($\phi$ = 0). For example, for a semi-space z > 0 limited by a conducting plane (Fig. 24), Eq. (185) yields

$$G = \frac{1}{|\mathbf{r}-\mathbf{r}'|} - \frac{1}{|\mathbf{r}-\mathbf{r}''|}, \quad \text{with } \boldsymbol{\rho}'' = \boldsymbol{\rho}' \text{ and } z'' = -z'. \tag{2.206}$$



63 See, e.g., CM Sec. 4.1, QM Secs. 2.2, 7.2 and 7.4, and SM Sec. 5.5. 

64 G so defined is sometimes called the Dirichlet function. 
We see that in the presence of conductors (and, as we will see later, any other polarizable media), the Green's function may depend not only on the difference $\mathbf{r} - \mathbf{r}'$, but also, in a specific way, on each of these two arguments.

So far, this looked just like re-naming our old results. The really non-trivial result of the Green’s 
function application to electrostatics is that, somewhat counter-intuitively, the knowledge of the 
Green’s function for a system with grounded conductors (Fig. 28a) allows one to calculate the field 
created by voltage-biased conductors (Fig. 28b), with the same geometry. 

Fig. 2.28. Green's function method allows the solution of a simpler boundary problem (a) to be used to find the solution of a more complex problem (b), for the same conductor geometry.


In order to show that, let us use the so-called Green's theorem of the vector calculus. 65 The theorem states that for any two scalar, differentiable functions $f(\mathbf{r})$ and $g(\mathbf{r})$, and any volume V,

$$\int_V\left(f\nabla^2g - g\nabla^2f\right)d^3r = \oint_S\left(f\nabla g - g\nabla f\right)_n d^2r, \tag{2.207}$$

where S is the surface limiting the volume. Applying the theorem to the electrostatic potential $\phi(\mathbf{r})$ and the Green's function G (also considered as a function of $\mathbf{r}$), let us use the Poisson equation (1.41) to replace $\nabla^2\phi$ with $(-\rho/\varepsilon_0)$, and notice that G, considered as a function of $\mathbf{r}$, obeys the Poisson equation with the $\delta$-functional source:

$$\nabla^2G(\mathbf{r},\mathbf{r}') = -4\pi\delta(\mathbf{r}-\mathbf{r}'). \tag{2.208}$$

(Indeed, according to its definition (202), this function may be formally considered as the field of a point charge $q = 4\pi\varepsilon_0$.) Now swapping the notation of radius-vectors, $\mathbf{r} \leftrightarrow \mathbf{r}'$, and using the Green's function symmetry, $G(\mathbf{r},\mathbf{r}') = G(\mathbf{r}',\mathbf{r})$, 66 we get




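$$-4\pi\phi(\mathbf{r}) + \frac{1}{\varepsilon_0}\int_V\rho(\mathbf{r}')G(\mathbf{r},\mathbf{r}')\,d^3r' = \oint_S\left[\phi(\mathbf{r}')\frac{\partial G(\mathbf{r},\mathbf{r}')}{\partial n'} - G(\mathbf{r},\mathbf{r}')\frac{\partial\phi(\mathbf{r}')}{\partial n'}\right]d^2r'. \tag{2.209}$$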


Let us apply this relation to volume V of free space between the conductors, and the boundary S slightly outside of their surface. In this case, by its definition, the Green's function $G(\mathbf{r},\mathbf{r}')$ vanishes at the conductor surface ($\mathbf{r} \in S$) - see Eq. (205). Now changing the sign of $\partial n'$ (so that it would be the outer normal for conductors, rather than free space volume V), dividing all terms by $4\pi$, and partitioning


65 See, e.g., MA Eq. (12.3). Actually, this theorem is a ready corollary of the divergence theorem, MA Eq. (12.2). 

66 This symmetry, virtually evident from Eq. (204), may be formally proved by applying Eq. (207) to functions $f(\mathbf{r}) = G(\mathbf{r},\mathbf{r}')$ and $g(\mathbf{r}) = G(\mathbf{r},\mathbf{r}'')$. With this substitution, the left-hand part becomes equal to $-4\pi[G(\mathbf{r}'',\mathbf{r}') - G(\mathbf{r}',\mathbf{r}'')]$, while the right-hand part is zero, due to Eq. (205).
the total surface S into the parts (numbered by index k) corresponding to different conductors (possibly, kept at different potentials $\phi_k$), we finally arrive at the famous result: 67

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_V\rho(\mathbf{r}')G(\mathbf{r},\mathbf{r}')\,d^3r' + \frac{1}{4\pi}\sum_k\phi_k\oint_{S_k}\frac{\partial G(\mathbf{r},\mathbf{r}')}{\partial n'}\,d^2r'. \tag{2.210}$$


While the first term in the right-hand part of this relation is a direct and evident expression of the superposition principle, given by Eq. (203), the second term is highly non-trivial: it describes the effect of conductors with nonvanishing potentials $\phi_k$ (Fig. 28b) using the Green's function calculated for the similar system with grounded conductors, i.e. with all $\phi_k$ = 0 (Fig. 28a). Let me emphasize that since our volume V excludes conductors, the first term in the right-hand part of Eq. (210) includes only the “free-standing” charges of the system (in Fig. 28, marked $q_1$, $q_2$, etc.), but not the surface charges of the conductors - which are taken into account, implicitly, by the second term.

In order to illustrate what a powerful tool Eq. (210) is, let us use it to calculate the electrostatic field in two systems. In the first of them, a circular disk, separated with a very thin cut from a conducting plane, is biased with potential $\phi = V$, while the rest of the plane is grounded - see Fig. 29.




Fig. 2.29. Voltage-biased conducting circle inside a grounded conducting plane. 


If the width of the gap between the circle and the rest of the plane is negligible, we may apply Eq. (210) with $\rho(\mathbf{r}')$ = 0, and the Green's function for the uncut plane - see Eq. (206). 68 In the cylindrical coordinates, the function may be rewritten as

$$G(\mathbf{r},\mathbf{r}') = \frac{1}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+(z-z')^2\right]^{1/2}} - \frac{1}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+(z+z')^2\right]^{1/2}}. \tag{2.211}$$

(The sum of the first three terms under the square roots of Eq. (211) is just the squared distance between the horizontal projections $\boldsymbol{\rho}$ and $\boldsymbol{\rho}'$ of vectors $\mathbf{r}$ and $\mathbf{r}'$ (or $\mathbf{r}''$), correspondingly, while the last terms are the squares of their vertical spacings.)

Now we can readily calculate the necessary derivative: 


67 In some textbooks, the sign before the surface integral is negative, because their authors use the outer normal of 
the free-space region V rather than that occupied by conductors - as I do. 

68 Indeed, if all parts of the cut plane are grounded, a narrow cut does not change the field distribution, and hence 
the Green’s function, significantly. 
$$\frac{\partial G}{\partial n'}\bigg|_S = \frac{\partial G}{\partial z'}\bigg|_{z'=+0} = \frac{2z}{\left[\rho^2+\rho'^2-2\rho\rho'\cos(\varphi-\varphi')+z^2\right]^{3/2}}. \tag{2.212}$$


Due to the axial symmetry of the system, we can take $\varphi$ for zero. With this, Eqs. (210) and (212) yield

$$\phi = \frac{V}{4\pi}\oint_{\rho'<R}\frac{\partial G}{\partial n'}\,d^2r' = \frac{Vz}{2\pi}\int_0^R\rho'\,d\rho'\int_0^{2\pi}\frac{d\varphi'}{\left(\rho^2+\rho'^2-2\rho\rho'\cos\varphi'+z^2\right)^{3/2}}. \tag{2.213}$$


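This integral is not too pleasing, but may be readily worked out for points on the symmetry axis ($\rho$ = 0):

$$\phi = Vz\int_0^R\frac{\rho'\,d\rho'}{(\rho'^2+z^2)^{3/2}} = \frac{V}{2}\int_0^{R^2/z^2}\frac{d\xi}{(\xi+1)^{3/2}} = V\left[1 - \frac{z}{(R^2+z^2)^{1/2}}\right]. \tag{2.214}$$

This expression shows that if $z \to 0$, the potential tends to V (as it should), while at z >> R,

$$\phi \to V\frac{R^2}{2z^2}. \tag{2.215}$$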


This asymptotic behavior is typical for electric dipoles - see the next chapter. 
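Off the symmetry axis the double integral of Eq. (2.213) has to be taken numerically. The sketch below is one simple way to do this, using a plain midpoint rule; the function names and the grid sizes are assumptions made for this illustration. On the axis it may be compared with the closed form $V[1 - z/(R^2+z^2)^{1/2}]$, which follows from the same integral at $\rho$ = 0.

```python
import math

def disk_potential(rho, z, R=1.0, V=1.0, n_r=400, n_phi=200):
    """Midpoint-rule estimate of the double integral in Eq. (2.213)."""
    dr = R / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        rp = (i + 0.5) * dr                      # integration radius rho'
        for k in range(n_phi):
            ph = (k + 0.5) * dphi                # integration angle phi'
            denom = rho * rho + rp * rp - 2.0 * rho * rp * math.cos(ph) + z * z
            total += rp / denom**1.5
    return V * z / (2.0 * math.pi) * total * dr * dphi

def axis_potential(z, R=1.0, V=1.0):
    """On-axis closed form of the same integral."""
    return V * (1.0 - z / math.sqrt(R * R + z * z))

if __name__ == "__main__":
    for z in (0.5, 2.0, 10.0):
        print(z, disk_potential(0.0, z), axis_potential(z), 0.5 / z**2)
    print(disk_potential(0.5, 0.5))              # an off-axis point
```

At z >> R the printed values approach $VR^2/2z^2$, the dipole-like asymptote quoted in the text.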

Now, let us use the same Eq. (210) to solve the (in:-)famous problem of the cut sphere (Fig. 30). Again, if the gap between the two conducting semi-spheres is very thin (t << R), we may use the Green's function for the grounded (and uncut) sphere. For a particular case $\mathbf{r}' = d\mathbf{n}_z$, this function is given by Eqs. (197)-(198); generalizing the former relation for an arbitrary direction of vector $\mathbf{r}'$, we get

$$G = \frac{1}{\left(r^2+r'^2-2rr'\cos\gamma\right)^{1/2}} - \frac{R/r'}{\left[r^2+(R^2/r')^2-2r(R^2/r')\cos\gamma\right]^{1/2}}, \quad \text{for } r, r' \geq R, \tag{2.216}$$

where $\gamma$ is the angle between vectors $\mathbf{r}$ and $\mathbf{r}'$, and hence $\mathbf{r}''$ (Fig. 30).



Fig. 2.30. A system of two, oppositely biased semi-spheres.


Now, finding the Green's function's derivative,

$$\frac{\partial G}{\partial r'}\bigg|_{r'=R+0} = \frac{r^2-R^2}{R\left[r^2+R^2-2Rr\cos\gamma\right]^{3/2}}, \tag{2.217}$$

and plugging it into Eq. (210), we see that the integration is easy only for the field on the symmetry axis ($\mathbf{r} = z\mathbf{n}_z$, $\gamma = \theta$), giving

$$\phi = \frac{V}{2}\left[1 - \frac{z^2-R^2}{z\,(z^2+R^2)^{1/2}}\right]. \tag{2.218}$$

For $z \to R$, $\phi \to V/2$ (just checking), while for z >> R,

$$\phi \to \frac{3R^2}{4z^2}V, \tag{2.219}$$

so this is also an electric dipole field - see the next chapter.
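The on-axis result may be double-checked by taking the surface integral of Eq. (2.210) numerically. The sketch below rests on the assumptions that the two semi-spheres are kept at potentials +V/2 and -V/2 (as the limit at z approaching R suggests), and that the derivative (2.217) is integrated over the sphere with a midpoint rule in the polar angle; the function names are hypothetical.

```python
import math

def cut_sphere_axis(z, R=1.0, V=1.0, n=20_000):
    """Surface integral of Eq. (2.210) for the cut sphere, at an on-axis point z > R."""
    dth = math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        phi_s = 0.5 * V if th < math.pi / 2.0 else -0.5 * V   # surface potential
        dG = (z * z - R * R) / (R * (z * z + R * R - 2.0 * R * z * math.cos(th)) ** 1.5)
        total += phi_s * dG * R * R * math.sin(th) * dth
    # The azimuthal integration gives 2*pi; Eq. (2.210) carries the factor 1/(4*pi).
    return 2.0 * math.pi * total / (4.0 * math.pi)

def cut_sphere_exact(z, R=1.0, V=1.0):
    """Closed form (2.218)."""
    return 0.5 * V * (1.0 - (z * z - R * R) / (z * math.sqrt(z * z + R * R)))

if __name__ == "__main__":
    for z in (1.5, 2.0, 10.0):
        print(z, cut_sphere_axis(z), cut_sphere_exact(z), 0.75 / z**2)
```

The last printed column is the dipole asymptote (2.219), which the other two approach at z >> R.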


2.8. Numerical methods 

Despite the richness of analytical methods, for many boundary problems (especially in geometries without a high degree of symmetry), numerical methods are the only way to the solution. Despite the current abundance of software codes and packages offering their automatic numerical solution, 69 it is important for an educated physicist to understand “what is under their hood”, at least because most universal programs exhibit mediocre performance in comparison with custom codes written for particular problems, and sometimes do not converge at all, especially for fast-changing (say, exponential) functions.

The simplest of the numerical methods of solution of partial differential equations is the finite-difference method 70 in which the sought function of N scalar arguments $f(r_1, r_2, \ldots, r_N)$ is represented by its values in discrete points of a rectangular grid (also called mesh) of the corresponding dimensionality (Fig. 31).


Fig. 2.31. General idea of the finite-difference method in (a) one, (b) two, and (c) three dimensions.


Each partial second derivative of the function is approximated by the formula that readily follows from the linear approximations for the function f and then its partial derivatives - see Fig. 31a:

$$\frac{\partial^2f}{\partial r_j^2} \approx \frac{1}{h}\left[\frac{f_\to - f}{h} - \frac{f - f_\leftarrow}{h}\right] = \frac{f_\to + f_\leftarrow - 2f}{h^2}, \tag{2.220}$$


69 See, for example, MA Secs. 16 (iii) and (iv). 

70 For more details see, e.g., R. Leveque, Finite Difference Methods for Ordinary and Partial Differential 
Equations, SIAM, 2007. 
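where $f_\to \equiv f(r_j + h)$ and $f_\leftarrow \equiv f(r_j - h)$, with h being the mesh step. (The error of this approximation of the second derivative is of the order of $h^2\,\partial^4f/\partial r_j^4$, i.e. its numerator is accurate up to terms of the order of $h^4\,\partial^4f/\partial r_j^4$.) As a result, a 2D Laplace operator may be presented as

$$\frac{\partial^2f}{\partial x^2} + \frac{\partial^2f}{\partial y^2} \approx \frac{f_\to + f_\leftarrow - 2f}{h^2} + \frac{f_\uparrow + f_\downarrow - 2f}{h^2} = \frac{f_\to + f_\leftarrow + f_\uparrow + f_\downarrow - 4f}{h^2}, \tag{2.221}$$

while the 3D operator as

$$\frac{\partial^2f}{\partial x^2} + \frac{\partial^2f}{\partial y^2} + \frac{\partial^2f}{\partial z^2} \approx \frac{f_\to + f_\leftarrow + f_\uparrow + f_\downarrow + f_\nearrow + f_\swarrow - 6f}{h^2}. \tag{2.222}$$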


(The notation used in these formulas should be clear from Figs. 31b and 31c, respectively.)

Let us apply this scheme to find the electrostatic potential distribution inside of a cylindrical box with conducting walls and square cross-section, using an extremely coarse mesh with step h = a/2 (Fig. 32). In this case our function, the electrostatic potential, equals zero on the side walls and the bottom, and equals $V_0$ at the top lid, so that, according to Eq. (221), the Laplace equation may be approximated as

$$\frac{0 + 0 + V_0 + 0 - 4\phi}{(a/2)^2} = 0. \tag{2.223}$$


The resulting value for the potential in the center of the box is $\phi = V_0/4$. Surprisingly, this is the exact value! This may be proved by solving this problem by the variable separation method, just as this has been done for the similar 3D problem in Sec. 4 above. The result is

$$\phi(x,y) = \sum_{n=1}^{\infty}c_n\sin\frac{\pi nx}{a}\sinh\frac{\pi ny}{a}, \qquad c_n = \frac{4V_0}{\pi n\sinh(\pi n)}\times\begin{cases}1, & \text{if } n \text{ is odd},\\ 0, & \text{otherwise},\end{cases} \tag{2.224}$$

so that at the central point (x = y = a/2),

$$\phi = \frac{4V_0}{\pi}\sum_{j=0}^{\infty}\frac{\sin[\pi(2j+1)/2]\,\sinh[\pi(2j+1)/2]}{(2j+1)\sinh[\pi(2j+1)]} = \frac{2V_0}{\pi}\sum_{j=0}^{\infty}\frac{(-1)^j}{(2j+1)\cosh[\pi(2j+1)/2]}. \tag{2.225}$$

The last series equals exactly $\pi/8$, so that $\phi = V_0/4$.
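The value of the last series in Eq. (2.225) is easy to confirm numerically, because its terms decay exponentially. A minimal sketch (with a hypothetical function name and an arbitrary default number of terms):

```python
import math

def center_series(terms=20):
    """Partial sum of the last series in Eq. (2.225)."""
    return sum(
        (-1) ** j / ((2 * j + 1) * math.cosh(math.pi * (2 * j + 1) / 2.0))
        for j in range(terms)
    )

if __name__ == "__main__":
    for terms in (1, 2, 3, 20):
        s = center_series(terms)
        # The second column is the central potential in units of V0.
        print(terms, s, 2.0 * s / math.pi)
    print(math.pi / 8.0)
```

Already three terms reproduce $\pi/8$ to about six decimal places, so the central potential $V_0/4$ is recovered almost immediately.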



Fig. 2.32. Numerical solution of the internal 2D boundary problem for a conducting, cylindrical box with square cross-section, using a very coarse mesh (with h = a/2).


For a similar 3D problem (a cubic box) we can use Eq. (222) to get

$$\frac{0 + 0 + V_0 + 0 + 0 + 0 - 6\phi}{(a/2)^2} = 0,$$

so that $\phi = V_0/6$. Unbelievably enough, this result is also exact! (This follows from our variable separation result expressed by Eqs. (95) and (99).)

Though such exact results should be considered as a happy coincidence rather than the norm, 
they still show that numerical methods, with a relatively crude mesh, may be more computationally 
efficient than the “analytical” approaches, like the variable separation method with its infinite-series 
results that, in most cases, require computers anyway for the result comprehension and analysis. 
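For a finer mesh, the set of linear equations following from Eq. (2.221) may be solved iteratively. The sketch below uses Gauss-Seidel sweeps, which is just one common choice and is not prescribed by the text; the function name, the mesh size, and the number of sweeps are assumptions made for this illustration.

```python
def relax_square(n=8, v0=1.0, sweeps=2000):
    """Solve the 2D Laplace equation on an (n+1) x (n+1) mesh of a square box.

    The top lid (row n) is kept at potential v0, the other three walls at zero.
    Each sweep replaces every interior value with the average of its four
    neighbors, which is the 5-point relation (2.221) set equal to zero.
    """
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n):
        phi[n][i] = v0
    for _ in range(sweeps):
        for j in range(1, n):
            for i in range(1, n):
                phi[j][i] = 0.25 * (
                    phi[j][i + 1] + phi[j][i - 1] + phi[j + 1][i] + phi[j - 1][i]
                )
    return phi

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        phi = relax_square(n)
        print(n, phi[n // 2][n // 2])   # central value, in units of V0
```

For n = 2 this reduces to the single equation (2.223). For larger even n the central value stays at $V_0/4$, which follows from the symmetry of the problem: superimposing four copies of the solution, each rotated by 90 degrees, gives a box with all walls at $V_0$, and hence a uniform potential inside.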

A more powerful (but also much more complex for implementation) approach is the finite-element method in which the discrete point mesh, typically with triangular cells, is (automatically) generated in accordance with the system geometry. Such mesh generators provide higher point concentration near sharp convex parts of conductor surfaces, where the field concentrates and hence the potential changes faster, and thus ensure a better accuracy-to-performance trade-off than the finite-difference methods on a uniform grid. The price to pay for this improvement is the algorithm complexity that makes manual adjustments much harder. Unfortunately I do not have time for going into the details of that method, and have to refer the reader to the special literature on this subject. 71


2.9. Exercise problems 

2.1 . Calculate the force (per unit area) exerted on a conducting surface by an electric field. 
Compare the result with the definition of the electric field, given by Eq. (1.5). 


2.2 . A thin plane film, carrying a uniform electric charge density $\sigma$, is placed inside a plane capacitor whose plates are connected by a wire - see Fig. on the right. Neglecting the edge effects, calculate the surface charges of the plates, and the net force acting on the film (per unit area).



2.3 . Following the discussion of two weakly coupled spheres in Sec. 2, find an approximate 
expression for the mutual capacitance (per unit length) between two very thin, parallel wires, both with 
a round cross-section, but each with its own diameter. Compare the result with that for two small 
spheres, and interpret the difference. 

2.4 . Use the Gauss law to calculate the mutual capacitance of the 
following 2-electrode systems, with the cross-section shown in Fig. 5 
(reproduced on the right): 

(i) a conducting sphere inside a concentric spherical cavity in another 
conductor, and 

(ii) a conducting cylinder inside a coaxial cavity in another conductor. 

(In this case, we speak about the capacitance per unit length). 



71 See, e.g., C. Johnson, Numerical Solution of Partial Differential Equations by the Finite Element Method, 
Dover, 2009, or T. J. R. Hughes, The Finite Element Method, Dover, 2000. 
Compare the results with those obtained in Sec. 2.2, using the Laplace equation solution.


2.5 . Calculate the electrostatic potential distribution around two barely 
separated conductors in the form of coaxial, round cones (see Fig. on the right), 
with voltage V between them. Compare the result with that of a similar 2D 
problem, with the cones replaced by plane-face wedges. Can you calculate the 
mutual capacitance between the conductors in any of these systems? If not, can 
you estimate it? 



2.6 . A system of two thin conducting plates is located over a ground plane as shown in Fig. on the right, where A' and A'' are plate part areas, while d' and d'' are distances between them. Neglecting the fringe effects, calculate:

(i) the effective capacitance of each plate, and 

(ii) their mutual capacitance. 








2.7 . Using the results for a single thin round disk, obtained in Sec. 4, 
consider a system of two such disks at a small distance d « R from each other - 
see Fig. on the right. In particular, calculate: 

(i) the reciprocal capacitance matrix of the system, 

(ii) the mutual capacitance between the disks, 

(iii) the partial capacitance, and 

(iv) the effective capacitance of one disk, 



(all in the first non-vanishing approximations in d/R « 1). Compare the results (ii)-(iv) and interpret 
their similarities and differences. 


2.8 .* Calculate the mutual capacitance (per unit length) between two
cylindrical conductors forming a system with the cross-section shown in
Fig. on the right, in the limit t ≪ w ≪ R.


Hint : You may like to use elliptical (not “ellipsoidal”!) coordinates
{α, β}, defined by the following equation:

$$x + iy = c\cosh(\alpha + i\beta), \qquad (*)$$

with the appropriate choice of constant c. In these orthogonal 2D 
coordinates, the Laplace operator is very simple: 72 


$$\nabla^2 = \frac{1}{c^2(\cosh^2\alpha - \cos^2\beta)}\left(\frac{\partial^2}{\partial\alpha^2} + \frac{\partial^2}{\partial\beta^2}\right).$$



72 This fact should not be surprising, because Eq. (*) is essentially the conformal map $z = c\cosh w$, where $z = x + iy$
and $w = \alpha + i\beta$ - see the discussion in Sec. 4.
2.9 . Formulate 2D electrostatic problems that can be solved using each of the following analytic
functions of the complex variable $z = x + iy$:

(i) $w = \ln z$,

(ii) $w = z^{1/2}$,

and solve these problems. 


2.10 . On each wall of a cylindrical volume with a rectangular
cross-section a × b, with no electric charges inside it, the electric field is
uniform, normal to the wall plane, and opposite to that on the opposite side
- see Fig. on the right. Calculate the distribution of the electric potential
inside the volume, provided that the field magnitude on the vertical
walls equals E.




2.11 . Complete the solution of the problem shown in Fig. 10, by calculating the distribution of
the surface charge of the semi-planes. Can you calculate the mutual capacitance between the plates (per 
unit length)? If not, can you estimate it? 


2.12 .* A straight, long, thin, round-cylindrical metallic pipe has been 
cut, along its axis, into two equal parts - see Fig. on the right. 


(i) Use the conformal mapping method to calculate the distribution of 
the electrostatic potential, created by voltage V applied between the two parts, 
both outside and inside the pipe, and of the surface charge. 

(ii) Calculate the mutual capacitance between the pipe’s halves (per unit
length), taking into account a small width 2t ≪ R of the cut.

Hints: In Task (i), you may like to use the following complex
function:

$$w = \ln\frac{R + z}{R - z},$$



while in Task (ii), it is advisable to use the solution of the previous problem. 


2.13 . Solve Task (i) of the previous problem using the variable separation method, and compare 
the results. 


2.14 . Use the variable separation method to calculate the potential distribution above the plane 
surface of a conductor, with a strip of width w separated by very thin cuts, and biased with voltage V - 
see Fig. below. 


[Figure: the plane surface of the conductor is at potential φ = 0, except for a strip of width w biased to φ = V.]
2.15 . In the Fig. of the previous problem, the cut-out and voltage-biased part of the conducting
plane is now not a strip, but a square with side w. Calculate the potential distribution above the conductor’s
surface.


2.16 . Complete the cylinder problem started in Sec. 5 (see Fig. 15), for the cases when voltage 
on the top lid is fixed as follows: 


(i) $V = V_0 J_1(\xi_{11}\rho/R)\sin\varphi$, where $\xi_{11} \approx 3.832$ is the first root of function $J_1(x)$, and

(ii) $V = V_0 = \text{const}$.


For both cases, calculate the electric field in the centers of the lower and upper lids. (For
assignment (ii), an answer including series and/or integrals is satisfactory.)


2.17 . Each electrode of a large plane capacitor is cut into
long strips of equal width l, with very narrow gaps between them.
These strips are kept at the alternating potentials as shown in Fig.
on the right. Use the variable separation method to calculate the
electrostatic potential distribution. Explore the limit l ≪ d.



2.18 . Solve the problem shown in Fig. 19. In particular: 

(i) calculate and sketch the distribution of the electrostatic potential inside the system for various 
values of ratio R/h, and 

(ii) simplify the results for the limit R/h → 0.

2.19 . Use the variable separation method to find the potential distribution inside and outside of a
thin spherical shell of radius R, with fixed potential $\phi(R, \theta, \varphi) = V_0\sin\theta\cos\varphi$.

2.20 . A thin spherical shell carries charge with areal density $\sigma = \sigma_0\cos\theta$. Calculate the spatial
distribution of the electrostatic potential and field.

2.21 . Use the variable separation method to calculate the potential
distribution both inside and outside of a thin spherical shell of radius R,
separated with a very thin cut, along plane z = 0, into two halves, with
voltage V applied between them - see Fig. on the right. Analyze the
solution; in particular, compare the field at axis z, for z > R, with Eq.
(2.218), obtained by the Green’s function method.

Hint: You may like to use the following integral of a Legendre
polynomial with odd index l = 1, 3, 5, ... = 2n - 1: 73

$$I_n \equiv \int_0^1 P_{2n-1}(\xi)\,d\xi = (-1)^{n-1}\frac{(2n-3)!!}{2n\,(2n-2)!!}.$$


73 As a reminder, the double factorial (also called “semifactorial”) operator (!!) is similar to the usual factorial
operator (!), but with the product limited to numbers of the same parity as its argument (in our particular case, of
the odd numbers in the numerator, and even numbers in the denominator).

2.22 . A small conductor (in this context, usually called the single-electron
box or single-electron island) is placed between two conducting electrodes, with
voltage V applied between them. The gap between one of the electrodes and the
box is so narrow that electrons may tunnel quantum-mechanically through this gap
(“weak tunnel junction”) - see Fig. on the right. Neglecting thermal fluctuations,
calculate the equilibrium charge of the island as a function of V.

Hint : To solve this problem, you do not need to know much about
quantum-mechanical tunneling through weak junctions, 74 besides that such
tunneling of an electron, and its subsequent energy relaxation inside the conductor,
may be considered as a single inelastic (energy-dissipating) event. In the absence of thermal agitation,
such an event takes place when (and only when) it decreases the potential energy of the system.



2.23 . Use the image charge method to calculate the surface charges induced in the plates of a 
very broad plane capacitor of thickness D by a point charge q separated from one of the electrodes by 
distance d. 

2.24 . Prove the statement, made in Sec. 6, that the 2D boundary problem
shown in Fig. on the right can be solved using a finite number of image charges if
angle β equals π/n, where n = 1, 2, ...


2.25 . Use the image charge method to calculate the energy of electrostatic interaction between a 
point charge placed in the center of a spherical cavity that was carved inside a grounded conductor, and 
the conductor’s walls. Looking at the result, could it be obtained in a simpler way (or ways)? 



2.26 . Use the method of images to find the Green’s 
function of the system shown in Fig. on the right, where the 
bulge on the conducting plane has the shape of a semi-sphere of 
radius R. 



2.27 .* Use the fact of spherical inversion, expressed by Eq. (198), to develop an iterative method 
for more and more precise calculation of the mutual capacitance between two similar metallic spheres of 
radius R , with centers separated by distance d > 2 R. 


74 In this context, weak junction means a tunnel junction with transparency so low that the tunneling electron’s
wavefunction loses its quantum-mechanical coherence before the electron has time to tunnel back. In a typical
junction of a macroscopic area this condition is fulfilled if the effective tunnel resistance of the junction is much
higher than the quantum unit of resistance (see, e.g., QM Sec. 3.2), $R_Q \equiv \pi\hbar/2e^2 \approx 6.5$ kΩ.
2.28 .* A metallic sphere of radius $R_1$, carrying electric charge Q, is
placed inside a spherical cavity of radius $R_2 > R_1$, cut inside another metal.
Calculate the force exerted on the sphere if its center is displaced by a small
distance $\delta \ll R_1, R_2 - R_1$ from that of the cavity - see Fig. on the right.


2.29 . Within the simple model of electric field screening in conductors, discussed in Sec. 2.1,
analyze the partial screening of the electric field of a point charge q by a plane, uniform conducting film
of thickness t ≪ λ, where λ is (depending on charge carrier statistics) either the Debye or the Thomas-Fermi
screening length - see, respectively, Eqs. (2.8) or (2.10). Assume that the distance d between the
charge and plane is much larger than t.

2.30 . Suggest a convenient definition of 2D Green’s function for electrostatic problems, and 
calculate it for: 

(i) the unlimited free space, and 

(ii) the free space above a conducting plane. 

Use the latter result to re-solve Problem 14. 



2.31 . Find the 2D Green’s function for the free space 

(i) outside a round conducting cylinder, 

(ii) inside a round cylindrical hole in a conductor. 


2.32 . Solve Task (i) of Problem 12 (see also Problem 13), using the Green’s function method. 


Hints : You may like to use the 2D Green’s function derived in the solution of Problem 2.27(ii),
and the following table integral: 75

$$\int\frac{d\xi}{a + b\cos\xi} = \frac{2}{(a^2 - b^2)^{1/2}}\tan^{-1}\left[\frac{(a - b)}{(a^2 - b^2)^{1/2}}\tan\frac{\xi}{2}\right], \quad \text{if } a^2 - b^2 > 0.$$
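As a quick sanity check of the table integral quoted in the hint, its closed form may be compared with a direct numerical quadrature. The Python sketch below is only an illustration: it assumes a > |b| and 0 ≤ ξ < π (so that the principal branch of arctan may be used, and the antiderivative vanishes at ξ = 0), and its function names and test values are arbitrary choices.

```python
import math


def table_integral(x, a, b):
    """Closed form of the integral of 1/(a + b cos(xi)) from 0 to x.

    Assumes a*a > b*b and 0 <= x < pi (principal branch of arctan).
    """
    k = math.sqrt(a * a - b * b)
    return 2.0 / k * math.atan((a - b) * math.tan(x / 2.0) / k)


def numeric_integral(x, a, b, steps=20000):
    """Midpoint-rule value of the same integral, for comparison."""
    h = x / steps
    return sum(h / (a + b * math.cos((i + 0.5) * h)) for i in range(steps))


# Illustrative values only.
print(table_integral(1.0, 2.0, 1.0), numeric_integral(1.0, 2.0, 1.0))
```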


2.33 . Solve the same 2D boundary problem that was discussed in Sec. 6 (Fig. 32) using: 

(i) the finite difference method, with a finer square mesh, A = a/3, and 

(ii) the variable separation method. 

Compare the results (at the mesh points only) and comment. 


75 Here the notation tan⁻¹ is used for the multi-valued function (alternatively called Arctan) which is inverse to
tan. (Due to the π-periodicity of tan, function tan⁻¹ is defined to an arbitrary additive multiple of π.) At the
value interval [-π/2, +π/2], tan⁻¹ is usually called arctan.
Chapter 3. Polarization of Dielectrics

In the last chapter, we have discussed the electric polarization of conductors. In contrast to those
materials, in dielectrics the charge motion is limited to the interior of an atom or a molecule, so that the
electric polarization of these materials by an external field takes a different form. This issue is the main
subject of this chapter. In preparation for the analysis of dielectrics, we have to start with a more general
discussion of the electric field of a spatially-restricted system of charges.


3.1. Electric dipole 

Let us consider a localized system of charges, of a linear size scale a, and calculate a simple but 
approximate expression for the electrostatic field induced by the system at a distant point r. For that, let 
us select a reference frame with the origin either somewhere inside the system, or at a distance of the 
order of a from it (Fig. 1). 



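Then positions of all charges of the system satisfy the following condition

$$r' \ll r. \qquad (3.1)$$

Using this condition, we can expand the general expression (1.38) for the electrostatic potential $\phi(\mathbf{r})$ of
the system into the Taylor series in small parameter $\mathbf{r}' = \{r'_1, r'_2, r'_3\}$. For any spatial function of the
type $f(\mathbf{r} - \mathbf{r}')$, the expansion may be presented as 1

$$f(\mathbf{r} - \mathbf{r}') \approx f(\mathbf{r}) - \sum_{j=1}^{3} r'_j\frac{\partial f}{\partial r_j}(\mathbf{r}) + \frac{1}{2!}\sum_{j,j'=1}^{3} r'_j r'_{j'}\frac{\partial^2 f}{\partial r_j\,\partial r_{j'}}(\mathbf{r}) - \ldots \qquad (3.2)$$

The two leading terms of this expansion, sufficient for our current purposes, may be rewritten in the
vector form: 2

$$f(\mathbf{r} - \mathbf{r}') \approx f(\mathbf{r}) - \mathbf{r}'\cdot\nabla f(\mathbf{r}) + \ldots \qquad (3.3)$$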

Let us apply this approximate formula to the free-space Green’s function (2.204), which weighs the
charge density contributions in Eq. (1.38). The gradient of such a spherically-symmetric function $f(r) = 1/r$
is just $\mathbf{n}_r\,df/dr = -\mathbf{n}_r/r^2 = -\mathbf{r}/r^3$, so that

$$\frac{1}{|\mathbf{r} - \mathbf{r}'|} \approx \frac{1}{r} + \mathbf{r}'\cdot\frac{\mathbf{r}}{r^3}. \qquad (3.4)$$


1 See, e.g., MA Eq. (2.11b).

2 The third term (responsible for quadrupole effects), as well as all the following, multipole terms would require a
tensor (rather than vector) representation.

Plugging this dipole expansion into Eq. (1.38), we get

$$\phi(\mathbf{r}) \approx \frac{1}{4\pi\varepsilon_0}\left[\frac{1}{r}\int\rho(\mathbf{r}')\,d^3r' + \frac{\mathbf{r}}{r^3}\cdot\int\rho(\mathbf{r}')\,\mathbf{r}'\,d^3r'\right] \equiv \frac{1}{4\pi\varepsilon_0}\left(\frac{Q}{r} + \frac{\mathbf{r}\cdot\mathbf{p}}{r^3}\right), \qquad (3.5)$$

where Q is the net charge of the system, while the vector

$$\mathbf{p} \equiv \int\rho(\mathbf{r}')\,\mathbf{r}'\,d^3r', \qquad (3.6)$$

with magnitude p of the order of Qa, is called its (electric) dipole moment. 3


If Q ≠ 0, the second term in the right-hand part of Eq. (5) is just a small correction to the first
one, and in many cases may be ignored. (Remember, Eq. (5) is only valid in the limit r/a → ∞.)
However, the net charge of many systems is exactly zero. The most important example is a neutral atom
or a neutral molecule, in which the negative charge of electrons exactly compensates the positive charge
of protons in nuclei. For such neutral systems, the second (dipole-moment) term, $\phi_d$, in Eq. (5) is the
leading one. Due to its importance, let us rewrite this expression in two other, equivalent forms:

$$\phi_d = \frac{1}{4\pi\varepsilon_0}\frac{\mathbf{r}\cdot\mathbf{p}}{r^3} = \frac{1}{4\pi\varepsilon_0}\frac{p\cos\theta}{r^2} = \frac{1}{4\pi\varepsilon_0}\frac{pz}{(x^2 + y^2 + z^2)^{3/2}}, \qquad (3.7)$$

that are more convenient for some applications. Here θ is the angle between vectors p and r, and in the
last (Cartesian) presentation, axis z is directed along vector p. Figure 2a shows equipotential surfaces of
the dipole field (or rather their cross-sections by a plane in which vector p resides).


Fig. 3.2. Dipole field: (a) equipotential surfaces and (b) electric field lines, for vertical vector p. 


3 Accordingly, a localized system of charges with Q = 0, but p ≠ 0, is called an (electric) dipole.
The simplest example of the dipole (that gave such systems their name) is a system of two equal
but opposite point charges, +q and -q, with radius-vectors, respectively, $\mathbf{r}_+$ and $\mathbf{r}_-$:

$$\rho(\mathbf{r}) = (+q)\,\delta(\mathbf{r} - \mathbf{r}_+) + (-q)\,\delta(\mathbf{r} - \mathbf{r}_-). \qquad (3.8)$$

For this system, Eq. (6) yields

$$\mathbf{p} = (+q)\,\mathbf{r}_+ + (-q)\,\mathbf{r}_- = q(\mathbf{r}_+ - \mathbf{r}_-) = q\mathbf{a}, \qquad (3.9)$$

where a is the vector connecting points $\mathbf{r}_-$ and $\mathbf{r}_+$. Note that in this case (and for all systems with Q = 0),
the dipole moment does not depend on the reference frame origin choice.
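The origin independence of the dipole moment of a neutral system is easy to see numerically. The following minimal Python sketch uses the discrete analog of the dipole moment definition (a sum over point charges instead of the integral); the function name, the charge values and the positions are illustrative assumptions, not a part of the original discussion.

```python
def dipole_moment(charges, positions, origin=(0.0, 0.0, 0.0)):
    """Discrete analog of the dipole moment: p = sum_k q_k (r_k - origin)."""
    p = [0.0, 0.0, 0.0]
    for q, r in zip(charges, positions):
        for j in range(3):
            p[j] += q * (r[j] - origin[j])
    return tuple(p)


# Hypothetical two-charge dipole: +q at r_plus, -q at r_minus, with q = 1.
charges = [1.0, -1.0]
positions = [(0.0, 0.0, 0.5), (0.0, 0.0, -0.5)]

p_centered = dipole_moment(charges, positions)
p_shifted = dipole_moment(charges, positions, origin=(3.0, -2.0, 7.0))
print(p_centered, p_shifted)  # both equal q * a, with a = r_plus - r_minus

# For a system with a nonzero net charge Q, the result shifts by -Q * origin.
p_charged = dipole_moment([1.0, 1.0], positions, origin=(0.0, 0.0, 1.0))
print(p_charged)
```

The last call illustrates why the statement is restricted to Q = 0: shifting the origin by a vector b changes the sum by -Qb.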

A less trivial example is a conducting sphere of radius R in a uniform external electric field $\mathbf{E}_0$.
As a reminder, we have solved this problem in Sec. 2.5(iv) and obtained Eq. (2.176) as a result. The first
term in the parentheses of that relation describes the external field (2.173), so that the field of the sphere
itself (meaning the field of its surface charge induced by $\mathbf{E}_0$) is given by the second term:

$$\phi_s = \frac{E_0 R^3}{r^2}\cos\theta. \qquad (3.10)$$

Comparing this expression with the second form of Eq. (7), we see that the sphere has an induced dipole
moment

$$\mathbf{p} = 4\pi\varepsilon_0\mathbf{E}_0 R^3. \qquad (3.11)$$


This is an interesting example of a purely dipole field - in all points outside the sphere (r > R), the field
has no higher moments. 4


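Returning to the general properties of the dipole field, let us calculate its characteristics. First of
all, we may use Eq. (7) to calculate the electric field of a dipole:

$$\mathbf{E}_d = -\nabla\phi_d = -\frac{1}{4\pi\varepsilon_0}\nabla\left(\frac{\mathbf{r}\cdot\mathbf{p}}{r^3}\right) = -\frac{1}{4\pi\varepsilon_0}\nabla\left(\frac{p\cos\theta}{r^2}\right). \qquad (3.12)$$

The differentiation is easiest in spherical coordinates, using the well-known expression for the
gradient of a scalar function in these coordinates 5 and taking axis z parallel to vector p. From the last
form of Eq. (12) we immediately get

$$\mathbf{E}_d = \frac{p}{4\pi\varepsilon_0 r^3}\left(2\mathbf{n}_r\cos\theta + \mathbf{n}_\theta\sin\theta\right) \equiv \frac{1}{4\pi\varepsilon_0}\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{p}) - \mathbf{p}r^2}{r^5}. \qquad (3.13)$$

Figure 2b shows the electric field lines given by Eqs. (13).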
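To get a feeling of how fast the dipole field (13) approaches the exact one, it may be compared numerically with the field of two opposite point charges. The Python sketch below is only an illustration under stated assumptions: units are chosen so that 1/4πε₀ = 1, the charges ±1 are placed at z = ±a/2 (so that p = a along axis z), and all function names are hypothetical.

```python
import math


def point_charge_field(q, source, r):
    """Coulomb field of a point charge q located at `source` (1/4 pi eps0 = 1)."""
    d = [r[j] - source[j] for j in range(3)]
    dist = math.sqrt(sum(c * c for c in d))
    return [q * c / dist ** 3 for c in d]


def dipole_field(p, r):
    """Dipole field in the vector form [3 r (r.p) - p r^2] / r^5."""
    r2 = sum(c * c for c in r)
    rp = sum(r[j] * p[j] for j in range(3))
    return [(3.0 * r[j] * rp - p[j] * r2) / r2 ** 2.5 for j in range(3)]


def relative_error(a, r):
    """Relative difference between the exact two-charge field and the dipole field."""
    e_plus = point_charge_field(+1.0, (0.0, 0.0, +a / 2), r)
    e_minus = point_charge_field(-1.0, (0.0, 0.0, -a / 2), r)
    exact = [u + v for u, v in zip(e_plus, e_minus)]
    approx = dipole_field((0.0, 0.0, a), r)
    diff = math.sqrt(sum((u - v) ** 2 for u, v in zip(exact, approx)))
    return diff / math.sqrt(sum(u * u for u in exact))


# The same direction of observation, at distances r = 5a and r = 50a.
for point in [(3.0, 0.0, 4.0), (30.0, 0.0, 40.0)]:
    print(point, relative_error(1.0, point))
```

For this symmetric pair of charges the quadrupole moment vanishes, so the relative error is expected to drop roughly as (a/r)² with distance.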

Next, let us calculate the potential energy of interaction between a fixed dipole and an external
electric field, using Eq. (1.54). Assuming that the external field does not change much at distances of the
order of a (Fig. 1), we may expand the external potential $\phi_{ext}(\mathbf{r})$ into the Taylor series, just as Eq. (3)
prescribes, and keep only its two leading terms:



4 Other examples of dipole fields are given by two more systems discussed in Chapter 2 - see Eqs. (2.215) and 
(2.219). Those systems, however, do have higher-order multipole moments, so that for them, Eq. (7) gives only 
the long-distance approximation. 

5 See, e.g., MA Eq. (10.8) with ∂/∂φ = 0.

$$U = \int\rho(\mathbf{r})\,\phi_{ext}(\mathbf{r})\,d^3r \approx \int\rho(\mathbf{r})\left[\phi_{ext}(0) + \mathbf{r}\cdot\nabla\phi_{ext}(0)\right]d^3r = Q\phi_{ext}(0) - \mathbf{p}\cdot\mathbf{E}_{ext}. \qquad (3.14)$$


The first term is the potential energy the system would have if it were a point charge. If the net charge Q
is zero, that term disappears, and the leading contribution is due to the dipole moment:

$$U = -\mathbf{p}\cdot\mathbf{E}_{ext}. \qquad (3.15)$$

Note, however, that Eq. (15) is only valid for a fixed dipole, with p independent of $\mathbf{E}_{ext}$. In the opposite
limit, when the dipole is induced by the field, i.e. $\mathbf{p} \propto \mathbf{E}_{ext}$ (see Eq. (11) as an example), we can repeat
the discussion that accompanied Fig. 1.6 to show that Eq. (15) acquires an additional factor ½.


In particular, combining Eqs. (13) and (15), we may get the following important formula for the
interaction of two independent dipoles:

$$U = \frac{1}{4\pi\varepsilon_0}\frac{\mathbf{p}_1\cdot\mathbf{p}_2\,r^2 - 3(\mathbf{r}\cdot\mathbf{p}_1)(\mathbf{r}\cdot\mathbf{p}_2)}{r^5} = \frac{1}{4\pi\varepsilon_0}\frac{p_{1x}p_{2x} + p_{1y}p_{2y} - 2p_{1z}p_{2z}}{r^3}, \qquad (3.16)$$

where r is the vector connecting the dipoles, and axis z is directed along this vector. If each moment is
due to the polarization of the dipole by the electric field of its counterpart, $p_{1,2} \propto E_{2,1} \propto 1/r^3$, the
potential energy given by this expression (which is valid for this case with the additional factor ½) is always
negative and proportional to $1/r^6$. Such potential describes, in particular, the long-range, attractive part (the so-called
London dispersion force) of the interaction between electrically neutral atoms and molecules. 6
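The equivalence of the two forms of the dipole-dipole interaction energy, and the signs it gives for simple configurations, may be checked with a few lines of code. The Python sketch below is a minimal illustration in units with 1/4πε₀ = 1; the function names and the sample dipole moments are assumptions made only for this example.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def dipole_dipole_energy(p1, p2, r):
    """Vector form: U = [p1.p2 r^2 - 3 (r.p1)(r.p2)] / r^5 (1/4 pi eps0 = 1)."""
    r2 = dot(r, r)
    return (dot(p1, p2) * r2 - 3.0 * dot(r, p1) * dot(r, p2)) / r2 ** 2.5


def energy_z_axis(p1, p2, distance):
    """Cartesian form, valid when vector r is directed along axis z."""
    return (p1[0] * p2[0] + p1[1] * p2[1] - 2.0 * p1[2] * p2[2]) / distance ** 3


p1, p2 = (0.3, -1.2, 0.7), (1.1, 0.4, -0.5)  # arbitrary sample moments
print(dipole_dipole_energy(p1, p2, (0.0, 0.0, 2.0)), energy_z_axis(p1, p2, 2.0))

# Two unit dipoles at unit distance: "head-to-tail" versus "side-by-side".
print(dipole_dipole_energy((0, 0, 1), (0, 0, 1), (0, 0, 1)))  # attraction
print(dipole_dipole_energy((1, 0, 0), (1, 0, 0), (0, 0, 1)))  # repulsion
```

The head-to-tail arrangement of parallel dipoles gives a negative energy, twice larger in magnitude than the positive energy of the side-by-side arrangement, in agreement with the coefficients -2 and +1 of the Cartesian form.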

According to Eq. (15), in order to reach the minimum of U, the electric field “tries” to align the
dipole direction along its own. The quantitative expression of this effect is the torque $\boldsymbol{\tau}$ exerted by the
field. The simplest way to calculate it is to sum up all the elementary torques
$d\boldsymbol{\tau} = \mathbf{r}\times d\mathbf{F}_{ext} = \mathbf{r}\times\mathbf{E}_{ext}(\mathbf{r})\rho(\mathbf{r})\,d^3r$ exerted on all elementary charges of the system:

$$\boldsymbol{\tau} = \int\mathbf{r}\times\mathbf{E}_{ext}(\mathbf{r})\,\rho(\mathbf{r})\,d^3r \approx \mathbf{p}\times\mathbf{E}_{ext}(0), \qquad (3.17)$$

where at the last transition we have again neglected the spatial dependence of the external field.

The spatial dependence of $\mathbf{E}_{ext}$ cannot, however, be ignored at the calculation of the total force
exerted by the field on the dipole (with Q = 0). Indeed, Eq. (15) shows that if the field is constant, the
dipole energy is the same at all spatial points, and hence the net force is zero. However, if the field has a
finite gradient, a total force does appear:

$$\mathbf{F} = -\nabla U = \nabla(\mathbf{p}\cdot\mathbf{E}_{ext}), \qquad (3.18)$$

where the derivative has to be taken at the dipole’s position (in our notation, at r = 0). If the dipole that
is being moved in a field retains its magnitude and orientation, then the last formula is equivalent to 7

$$\mathbf{F} = (\mathbf{p}\cdot\nabla)\mathbf{E}_{ext}. \qquad (3.19)$$

Alternatively, the last expression may be obtained similarly to Eq. (14):

$$\mathbf{F} = \int\rho(\mathbf{r})\,\mathbf{E}_{ext}(\mathbf{r})\,d^3r \approx \int\rho(\mathbf{r})\left[\mathbf{E}_{ext}(0) + (\mathbf{r}\cdot\nabla)\mathbf{E}_{ext}\right]d^3r = Q\mathbf{E}_{ext}(0) + (\mathbf{p}\cdot\nabla)\mathbf{E}_{ext}. \qquad (3.20)$$
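The equivalence of the two expressions for the force on a fixed dipole, F = ∇(p·E) and F = (p·∇)E, relies on the external field being curl-free. The Python sketch below checks it numerically by central finite differences; the external field is taken (as an assumption made only for this illustration) to be that of a unit point charge in units with 1/4πε₀ = 1, and the function names, the dipole moment and the observation point are arbitrary.

```python
def field(r, source=(0.0, 0.0, -2.0), q=1.0):
    """Hypothetical external field: Coulomb field of a point charge (1/4 pi eps0 = 1)."""
    d = [r[j] - source[j] for j in range(3)]
    dist3 = sum(c * c for c in d) ** 1.5
    return [q * c / dist3 for c in d]


def force_from_energy(p, r0, h=1e-5):
    """F = grad(p . E_ext): central differences of the scalar p . E_ext."""
    force = []
    for j in range(3):
        r_plus, r_minus = list(r0), list(r0)
        r_plus[j] += h
        r_minus[j] -= h
        u_plus = sum(a * b for a, b in zip(p, field(r_plus)))
        u_minus = sum(a * b for a, b in zip(p, field(r_minus)))
        force.append((u_plus - u_minus) / (2 * h))
    return force


def force_directional(p, r0, h=1e-5):
    """F = (p . grad) E_ext: central difference of E_ext along the direction of p."""
    r_plus = [r0[j] + h * p[j] for j in range(3)]
    r_minus = [r0[j] - h * p[j] for j in range(3)]
    return [(a - b) / (2 * h) for a, b in zip(field(r_plus), field(r_minus))]


p = (0.3, -0.4, 0.5)
r0 = (0.2, 0.1, 0.0)
print(force_from_energy(p, r0))
print(force_directional(p, r0))
```

The two printed vectors should agree to within the finite-difference accuracy; for a field with a nonzero curl (which is impossible in electrostatics) they would differ.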


6 See, e.g., SM Sec. 3.5. 

7 The equivalence may be proved, for example, by using MA Eq. (11.6) with f = p = const and g = $\mathbf{E}_{ext}$, taking
into account that according to the general Eq. (1.28), $\nabla\times\mathbf{E}_{ext} = 0$.
Finally, let me add a note on the so-called coarse-grain model of the dipole. The dipole
approximation explored above is asymptotically correct at large distances, r ≫ a. However, for some
applications (including the forthcoming discussion in Sec. 5 of molecular field effects) it is important to
have an expression that would be approximately valid everywhere in space, though maybe without exact
details at r ~ a, and also give the correct result for the space-average of the electric field,

$$\overline{\mathbf{E}} \equiv \frac{1}{V}\int_V\mathbf{E}\,d^3r, \qquad (3.21)$$

where V is a regularly-shaped volume much larger than $a^3$, for example a sphere of radius R ≫ a, with
the dipole at its center. For the field $\mathbf{E}_d$ given by Eq. (13), such average is zero. Indeed, let us consider
Cartesian components of that vector in the coordinate system with axis z directed along vector p. Due to
the axial symmetry of the field, the averages of components $E_x$ and $E_y$ evidently vanish. Let us use Eq.
(13) to spell out the “vertical” component of the field (parallel to the dipole moment vector):


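$$E_z \equiv \mathbf{E}_d\cdot\frac{\mathbf{p}}{p} = \frac{1}{4\pi\varepsilon_0 r^3}\left(2\mathbf{n}_r\cdot\mathbf{p}\cos\theta + \mathbf{n}_\theta\cdot\mathbf{p}\sin\theta\right) = \frac{p}{4\pi\varepsilon_0 r^3}\left(2\cos^2\theta - \sin^2\theta\right). \qquad (3.22)$$

Integrating this expression over the whole solid angle Ω = 4π, at fixed r, using a convenient variable
substitution cos θ ≡ ξ, we get

$$\oint E_z\,d\Omega = 2\pi\int_0^\pi E_z\sin\theta\,d\theta = \frac{p}{2\varepsilon_0 r^3}\int_0^\pi\left(2\cos^2\theta - \sin^2\theta\right)\sin\theta\,d\theta = \frac{p}{2\varepsilon_0 r^3}\int_{-1}^{+1}\left(3\xi^2 - 1\right)d\xi = 0. \qquad (3.23)$$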
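The vanishing of the solid-angle integral of the dipole field's z-component is the result of an exact cancellation of two equal contributions, which may be verified by a direct numerical integration over the polar angle. The Python sketch below is only such a check, with the common prefactor p/4πε₀r³ dropped; the function names and the number of integration steps are arbitrary assumptions.

```python
import math


def polar_integrals(steps=20000):
    """Midpoint-rule values of the integrals over theta from 0 to pi of
    2 cos^2(theta) sin(theta) and of sin^3(theta); each should equal 4/3."""
    h = math.pi / steps
    radial_part = 0.0   # contribution of the n_r component of the field
    angular_part = 0.0  # contribution of the n_theta component of the field
    for i in range(steps):
        theta = (i + 0.5) * h
        radial_part += 2.0 * math.cos(theta) ** 2 * math.sin(theta) * h
        angular_part += math.sin(theta) ** 3 * h
    return radial_part, angular_part


def solid_angle_integral(steps=20000):
    """2 pi times the integral of (2 cos^2 - sin^2) sin(theta), up to the prefactor."""
    radial_part, angular_part = polar_integrals(steps)
    return 2.0 * math.pi * (radial_part - angular_part)


print(polar_integrals())
print(solid_angle_integral())
```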


On the other hand, the exact electric field of an arbitrary charge distribution, having the total
dipole moment p, satisfies the following condition,

$$\int_V\mathbf{E}(\mathbf{r})\,d^3r = -\frac{\mathbf{p}}{3\varepsilon_0}, \qquad (3.24)$$

where the integration is over any sphere containing all the charges. A proof of this formula for the
general case requires a somewhat cumbersome, though straightforward integration, 8 but later in the
course we will see that it is correct for several particular cases. The origin of the difference between Eqs.
(23) and (24) is illustrated in Fig. 3 on the example of a dipole created by two equal but opposite
charges - see Eqs. (8)-(9). The zero average of the dipole field (13) does not take into account the
contribution of the field in the region between the charges (where Eq. (13) is not valid), which is
directed mostly against the dipole vector (9).

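Thus in order to be used as a reasonable coarse-grain model, Eq. (13) should be modified as
follows:

$$\mathbf{E}_{cg} = \frac{1}{4\pi\varepsilon_0}\left[\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{p}) - \mathbf{p}r^2}{r^5} - \frac{4\pi}{3}\,\mathbf{p}\,\delta(\mathbf{r})\right]. \qquad (3.25)$$

Evidently, such modification does not change the field at large distances r ≫ a, i.e. in the region where
the expansion (3) and hence Eq. (13) are valid.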


8 See, e.g., the end of Sec. 4.1 in the textbook by J. Jackson, Classical Electrodynamics, 3rd ed., Wiley, 1999.
Let us generalize equation (7) to the case of several (possibly, many) dipoles $\mathbf{p}_j$ located at
arbitrary points $\mathbf{r}_j$. Using the linear superposition principle, we get

$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\sum_j\mathbf{p}_j\cdot\frac{\mathbf{r} - \mathbf{r}_j}{|\mathbf{r} - \mathbf{r}_j|^3}. \qquad (3.26)$$

If our system (medium) contains many similar dipoles, distributed in space with density n(r), we may
use the same standard argumentation that has led us from Eq. (1.5) to Eq. (1.8), to rewrite the last sum as
an integral

$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int\mathbf{P}(\mathbf{r}')\cdot\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\,d^3r', \qquad (3.27)$$

where vector P(r) ≡ n(r)p, called the electric polarization, has the physical meaning of the net dipole
moment per unit volume. Note again that Eq. (26) does not describe the field at distances
comparable to the dipole size, and hence Eq. (27), and all the following formulas of this section,
describe the so-called macroscopic electric field, i.e. the dipole field averaged over the microscopic
(dipole-size) distances.

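Now comes a very impressive mathematical trick. Just as has been done in the previous section
(just with the appropriate sign change), Eq. (27) may be rewritten in the equivalent form

$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int\mathbf{P}(\mathbf{r}')\cdot\nabla'\frac{1}{|\mathbf{r} - \mathbf{r}'|}\,d^3r', \qquad (3.28)$$

where ∇’ means the del operator (in this particular case, the gradient) acting in the “source space” of
vectors r’. The right-hand part of Eq. (28), applied to any volume V limited by surface S, may be
integrated by parts in the following way: 9

$$\phi_d(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\oint_S\frac{P_n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d^2r' - \frac{1}{4\pi\varepsilon_0}\int_V\frac{\nabla'\cdot\mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d^3r'. \qquad (3.29)$$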


9 To prove this (almost evident) formula strictly, it is sufficient to apply the divergence theorem given by MA Eq. 
(12.2), to vector function f = P(r ’)/|r - r ’|, in the “source space” of radius-vectors r ’. 
If the surface does not carry an infinitely dense (δ-functional) sheet of additional dipoles, or it is
just very far, the first term in the right-hand part is negligible. Now comparing the second term with the
basic equation (1.38) for the electric potential, we see that this term may be interpreted as the field of
certain effective electric charges with density

$$\rho_{ef} = -\nabla\cdot\mathbf{P}. \qquad (3.30)$$

Figure 4 illustrates the physics of this relation for a cartoon model of a multi-dipole system: a
layer of uniformly-distributed two-point-charge units oriented perpendicular to the layer surface. (In this
case ∇·P = dP/dx.) One can see that $\rho_{ef}$, defined by Eq. (30), may be interpreted as the density of
uncompensated surface charges of polarized elementary dipoles.
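The cartoon of a polarized layer may be complemented by a simple numerical illustration of the relation between the effective charge density and the divergence of polarization in one dimension. In the Python sketch below, the smooth profile P(x) (a layer with tanh-shaped edges), its parameters, and all function names are assumptions made only for this example; the effective charge density is found as -dP/dx by a central finite difference.

```python
import math


def polarization(x, p0=1.0, half_width=1.0, edge=0.1):
    """Hypothetical profile: P = p0 inside the layer |x| < half_width,
    dropping to zero outside it over the length scale `edge`."""
    return 0.5 * p0 * (math.tanh((x + half_width) / edge)
                       - math.tanh((x - half_width) / edge))


def effective_charge_density(x, h=1e-5):
    """rho_ef = -dP/dx, calculated by a central finite difference."""
    return -(polarization(x + h) - polarization(x - h)) / (2 * h)


def integrate(f, a, b, steps=20000):
    """Midpoint-rule integral of function f over the segment [a, b]."""
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx


# Areal densities of the effective charge near each surface of the layer.
left_charge = integrate(effective_charge_density, -3.0, 0.0)
right_charge = integrate(effective_charge_density, 0.0, 3.0)
print(left_charge, right_charge, left_charge + right_charge)
```

For P directed along axis x, the effective charge is negative near the left surface and positive near the right one, with areal densities equal in magnitude to the polarization inside the layer, so that the layer as a whole stays neutral.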



Fig. 3.4. Spatial distributions of the 
polarization and effective charges in a layer of 
similar elementary dipoles (schematically). 


Next, from Sec. 1.2, we already know that Eq. (1.38) is equivalent to the inhomogeneous
Maxwell equation (1.27) for the electric field. This is why Eq. (30) implies that if, besides the
compensated charges of the dipoles, the system also has certain stand-alone charges (not a part of the
dipoles!) distributed in space with density ρ(r), the average electric field obeys, instead of Eq. (1.27),
the following generalized equation

$$\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\left(\rho + \rho_{ef}\right) = \frac{1}{\varepsilon_0}\left(\rho - \nabla\cdot\mathbf{P}\right). \qquad (3.31)$$

It is evidently tempting (and very convenient for applications!) to carry the dipole-related term of
this equation over to the left-hand part of Eq. (31), and rewrite the resulting equality as the so-called
macroscopic Maxwell equation

$$\nabla\cdot\mathbf{D} = \rho, \qquad (3.32)$$

where a new vector, called the electric displacement, is defined as 10

$$\mathbf{D} \equiv \varepsilon_0\mathbf{E} + \mathbf{P}. \qquad (3.33)$$


10 Note that the dimensionality of D in SI units is different from that of E. In contrast, in the Gaussian units the
electric displacement is defined as D = E + 4πP, so that ∇·D = 4πρ (the relation $\rho_{ef}$ = -∇·P remains the same as in
SI units), and the dimensionalities of D and E coincide. Philosophically, this coincidence is a certain handicap,
because it is frequently convenient to consider Cartesian components of E as a generalized force, and those of D
as a generalized coordinate (see Sec. 6 below), and it is comforting to have their dimensionality different.


The comparison of Eqs. (32) and (1.27) shows that D may be interpreted as the “would-be”
electric field that would be created by stand-alone charges in the absence of the dipole medium
polarization. In contrast, E is the actual electric field - though, as was mentioned above, space-averaged
over a volume much larger than that of an elementary dipole. 11


To complete the general analysis of the multi-dipole systems, let us rewrite the macroscopic
Maxwell equation (32) in the integral form. Applying the divergence theorem to an arbitrary volume V
limited by surface S, we get the following macroscopic Gauss law:

$$\oint_S D_n\,d^2r = \int_V\rho\,d^3r \equiv Q, \qquad (3.34)$$

where Q is the total stand-alone charge inside volume V.

Let me emphasize again that the key Eq. (27), and hence all the following equations of this
section, apply only to the macroscopic field, i.e. the electric field averaged over its rapid variations at the
atomic space scale. Such macroscopic description is valid as long as we are not concerned with the
inter-atomic field variations - for whose description the classical physics is inadequate in any case.


3.3. Linear dielectrics 


The general equations derived above are broadly used to describe any dielectrics - materials with
bound electric charges (and hence with no dc electric conduction). The polarization properties of these
materials may be described by the dependence between vectors P and E. In most materials, in the
absence of an external electric field, the elementary dipoles p either equal zero or have a random
orientation in space, so that the net dipole moment of each macroscopic volume (still containing many
such dipoles) equals zero: P = 0.


Moreover, if the field changes are sufficiently slow, most materials may be characterized by a
unique dependence of P on E. Then using the Taylor expansion of function P(E), we may argue that in
relatively low electric fields the function should be well approximated by a linear dependence between
these two vectors. In an isotropic medium, the coefficient of proportionality should be just a scalar. 12 In
SI units, this constant is defined by the following relation:

$$\mathbf{P} = \chi_e\varepsilon_0\mathbf{E}, \qquad (3.35)$$

with the dimensionless constant $\chi_e$ called the electric susceptibility. However, it is much more common
to use, instead of $\chi_e$, another parameter,

$$\varepsilon_r \equiv 1 + \chi_e, \qquad (3.36)$$
(3.36) 


11 Note, however, that such averaging does not include the inner-dipole field, which is (approximately) described
by the second term of Eq. (25).

12 In anisotropic materials, such as crystals, a susceptibility tensor may be necessary to give an adequate 
description of the linear relation of vectors P and E. Fortunately, in most important crystals (such as silicon) the 
anisotropy of polarization is small, so that they may be reasonably well characterized by scalar susceptibility. 


which is sometimes called the relative electric permittivity, but much more often, the dielectric
constant. 13 This parameter is very convenient, because combining Eqs. (35) and (36),

$$\mathbf{P} = (\varepsilon_r - 1)\varepsilon_0\mathbf{E}, \qquad (3.37)$$

and then plugging the resulting relation into the general Eq. (33), we get simply 14

$$\mathbf{D} = \varepsilon\mathbf{E}, \quad \text{with } \varepsilon \equiv \varepsilon_0\varepsilon_r = \varepsilon_0(1 + \chi_e), \qquad (3.38)$$

where ε is called the electric permittivity of the material. Table 1 gives values of the dielectric constant
for several representative materials.


Table 3.1. Dielectric constants of a few representative (and/or practically important) dielectrics

Material                                      ε_r
Air (at ambient conditions)                   1.00054
Teflon (polytetrafluoroethylene, C_nF_2n)     2.1
Silicon dioxide (amorphous)                   3.9
Glasses (of various compositions)             3.7-10
Castor oil                                    4.5
Silicon                                       11.7
Water (at 100°C)                              55.3
Water (at 20°C)                               80.1
Barium titanate (BaTiO_3, at 20°C)            ~1,600



In order to get some feeling of the physics behind these values, let us consider a very common
model of a medium whose elementary dipoles do not interact, so that in the relation P = np the elementary
dipole moments p may be calculated independently of each other. This means that in a linear dielectric,
in which Eq. (35) holds, each induced dipole moment p has to be proportional to the applied field E as
well. Let us write this dependence in the following traditional form,

$$\mathbf{p} = 4\pi\varepsilon_0\alpha_{mol}\mathbf{E}, \qquad (3.39)$$

where $\alpha_{mol}$ is called the molecular (or, sometimes, “atomic”) polarizability, so that

$$\mathbf{P} = n\mathbf{p} = 4\pi\varepsilon_0\alpha_{mol}n\mathbf{E}. \qquad (3.40)$$

Comparing this relation with Eq. (35), we get $\chi_e = 4\pi\alpha_{mol}n$, so that Eq. (36) yields 15


13 Note that in electrical engineering literature, the dielectric constant is often denoted by letters k or K. 

14 In Gaussian units, y e is defined by relation P = y e E, while s is still defined as D = sE. Because of that, e is 
dimensionless and equals (1 + Any,)- Note that (^Gaussian = (<&’/ £-'o)si = s r , and (ydst = 44j e ) Gaussian , sometimes 
creating a confusion with the numerical values of the latter parameter. 
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£ r =\ + 4na moX n. (3.41) 

Now let us consider the following toy model of a dielectric: 16 a set of similar conducting spheres
of radius $R$, spread apart with small density $n \ll 1/R^3$. At such low density of the spheres, their
electrostatic interaction is negligible, and we can use Eq. (11) for the induced dipole moment of a single
sphere. Then the polarizability definition (39) yields $\alpha_{\rm mol} = R^3$, so that $\chi_e = 4\pi nR^3$, and

$$\varepsilon_r = 1 + 4\pi R^3n. \tag{3.42}$$

Let us use this result for a crude estimate of the dielectric constant of air at the so-called ambient
conditions, meaning the normal atmospheric pressure and temperature $T = 300$ K. At these conditions
the molecular density $n$ may be, with a few-percent accuracy, found from the equation of state of an
ideal gas: 17 $n \approx P/k_{\rm B}T \approx (1.013\times10^{5})/(1.38\times10^{-23}\times300) \approx 2.5\times10^{25}\ {\rm m}^{-3}$. The main component of air,
molecular nitrogen N$_2$, has a van der Waals radius 18 of 155 pm $= 1.55\times10^{-10}$ m. Using it for $R$, from our
crude model we get $\varepsilon_r \approx 1.001$. Comparing this number with the first line of Table 1, we see that our
crude model gives surprisingly reasonable results: in order to get the exact experimental value, it is
sufficient to decrease $R$ by just ~25%.

This result may encourage us to try using Eq. (42) for larger density $n$, i.e., beyond the range of
its quantitative applicability. For example, as a crude model for solids and liquids let us assume that
the spheres form a simple cubic lattice with period $a = 2R$ (i.e., the neighboring spheres almost touch). With
this $n = 1/a^3 = 1/8R^3$, Eq. (42) yields $\varepsilon_r = 1 + 4\pi/8 \approx 2.6$. Given the crude nature of this estimate, we
may conclude that it provides a reasonable explanation for the values of $\varepsilon_r$ listed in the first few lines of
Table 1. Still, it is clear that such a model cannot even approximately describe the dielectric properties of
either water or barium titanate (and similar materials), or their strong temperature dependence.
Such high values may be explained by the molecular field effect: each elementary dipole is polarized not
only by the external field (as in our current toy model), but by the field of neighboring dipoles as well.
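
The two estimates above are easy to reproduce numerically. The following is a minimal Python sketch of the conducting-sphere model, Eq. (3.42); the function and variable names are illustrative choices, and the constants are standard SI values rather than quantities taken from this text (except for the 155 pm radius and the tabulated 1.00054).

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
P_ATM = 1.013e5         # normal atmospheric pressure, Pa
T_AMBIENT = 300.0       # temperature used in the text, K
R_N2 = 1.55e-10         # van der Waals radius of N2 quoted in the text, m


def eps_r_sphere_model(radius, density):
    """Eq. (3.42): eps_r = 1 + 4*pi*R^3*n for non-interacting conducting spheres."""
    return 1.0 + 4.0 * math.pi * radius**3 * density


# Dilute case: air treated as an ideal gas of "conducting spheres".
n_air = P_ATM / (K_B * T_AMBIENT)              # molecules per m^3
eps_air = eps_r_sphere_model(R_N2, n_air)

# Radius that would reproduce the tabulated value 1.00054 within this model.
r_fit = ((1.00054 - 1.0) / (4.0 * math.pi * n_air)) ** (1.0 / 3.0)

# Dense case: simple cubic lattice of almost touching spheres, a = 2R.
n_dense = 1.0 / (2.0 * R_N2) ** 3
eps_dense = eps_r_sphere_model(R_N2, n_dense)  # 1 + pi/2, independent of R

print(f"n_air     = {n_air:.3e} m^-3")
print(f"eps_air   = {eps_air:.5f}")
print(f"R_fit / R = {r_fit / R_N2:.2f}")
print(f"eps_dense = {eps_dense:.3f}")
```

The dense-lattice value does not depend on the sphere radius at all, which is one more sign that in this regime the model is only an order-of-magnitude guide.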

15 Note that for all materials listed in Table 1, $\varepsilon_r > 1$, i.e. $\alpha_{\rm mol} > 0$. Actually, this is true for all stable dielectrics.
Let me postpone a discussion of this fact until Sec. 5.5 where I will compare physical mechanisms of the electric
and magnetic polarization.

16 A more accurate model of atomic polarization is discussed in QM Chapter 6.

17 See, e.g., SM Secs. 1.4 and 3.1.

18 Such radius is defined by the requirement that the volume of the corresponding sphere, used in the van der
Waals equation (see, e.g., SM Sec. 4.1) gives the best fit to the experimental equation of state $n = n(P, T)$.

Before analyzing this effect (in the next section), let us review how the most important
results of electrostatics are modified by a uniform linear dielectric medium that obeys Eq. (38) with a
space-independent dielectric constant $\varepsilon_r$. The simplest problem of this kind is a set of free charges of density
$\rho(\mathbf{r})$, inserted into the medium. For this case, we can combine Eqs. (32) and (38) to write

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon}, \quad \text{i.e. } \nabla^2\phi = -\frac{\rho}{\varepsilon}. \tag{3.43}$$

For charges in vacuum, we had similar equations (1.27) and (1.41), but with a different constant, $\varepsilon_0 = \varepsilon/\varepsilon_r$.
Hence all the results discussed in Chapter 1 are valid, with both $\mathbf{E}$ and $\phi$ reduced by the factor of $\varepsilon_r$.
Thus, the most straightforward result of the induced polarization of a dielectric medium is the electric field
reduction. This is a very important effect, especially taking into account the very high values of $\varepsilon_r$ in
such dielectrics as water - see Table 1. Indeed, it is the reduction of the attraction between positive
and negative ions (called, respectively, cations and anions) in water that enables their substantial
dissociation and hence almost all biochemical reactions, which are the basis of biological cell functions -
and hence of life itself.
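
To put a number on this reduction, here is a small Python sketch comparing the Coulomb attraction of a singly charged cation-anion pair in vacuum and in water with the thermal energy scale at 300 K. The ion separation of 0.3 nm is a hypothetical, merely plausible value, and using the macroscopic $\varepsilon_r$ at such a short distance is itself a rough assumption, so the result should be read as an order-of-magnitude illustration only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS_0 = 8.8541878128e-12     # electric constant, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K


def coulomb_energy(q1, q2, r, eps_r=1.0):
    """Interaction energy of two point charges in a uniform linear dielectric."""
    return q1 * q2 / (4.0 * math.pi * EPS_0 * eps_r * r)


r_ion = 0.3e-9               # hypothetical cation-anion distance, m
thermal = K_B * 300.0        # thermal energy scale at T = 300 K, J

u_vacuum = coulomb_energy(E_CHARGE, -E_CHARGE, r_ion)
u_water = coulomb_energy(E_CHARGE, -E_CHARGE, r_ion, eps_r=80.1)  # Table 3.1, 20 C

print(f"|U| in vacuum: {abs(u_vacuum) / thermal:.0f} k_B T")
print(f"|U| in water : {abs(u_water) / thermal:.1f} k_B T")
```

Under these assumptions the binding energy drops from a couple of hundred thermal units to just a few, which is why thermal motion can separate the ions in water.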


Now, what if the electric field in a uniform dielectric is induced by charges located on
conductors - with potentials rather than charge density fixed? Then, with the substitution of the
electrostatic potential definition $\mathbf{E} = -\nabla\phi$, Eq. (43) in the space between the conductors is reduced to the
Laplace equation, and the boundary problem remains exactly the same as formulated in Chapter 2 - see
Eqs. (2.35). Hence the potential distribution $\phi(\mathbf{r})$ is related to the conductor potentials in exactly the same
way as in vacuum (see, e.g., any problem discussed in Chapter 2), without any effect of the medium
polarization. However, in order to find, from that distribution, the density $\sigma$ of charges on conductor
surfaces, we need to use the macroscopic Gauss law (34). Applying this equation to a pillbox-shaped
volume on the conductor surface, we get the following relation,

$$\sigma = D_n = \varepsilon E_n = -\varepsilon\frac{\partial\phi}{\partial n}, \tag{3.44}$$

which differs from Eq. (2.3) only by the replacement $\varepsilon_0 \to \varepsilon = \varepsilon_r\varepsilon_0$. Hence the charge density, calculated
for the vacuum case, should be increased by the factor of $\varepsilon_r$ - that's it. In particular, this means that all
the capacitances that had been calculated in vacuum should be increased by that factor. For example,
for a planar capacitor filled with a linear dielectric with constant $\varepsilon_r$, we get the well-known formula

$$C_m = \frac{\varepsilon_r\varepsilon_0A}{d} = \frac{\varepsilon A}{d}. \tag{3.45}$$

(As a reminder, this increase of $C_m$ by $\varepsilon_r$ has been already used in Sec. 2.2 for capacitance estimates.)

Now let us discuss more complex situations in which the dielectric medium is not uniform, for
example when it contains a boundary separating two regions filled by different uniform dielectrics. (The
analysis is clearly applicable to a dielectric/vacuum boundary as well, with one of the dielectric
constants set to 1.) For that, let us apply the macroscopic Gauss law (34) to a pillbox formed at the
interface between two dielectrics, with no surface charges - see the solid lines in Fig. 5.

Fig. 3.5. Deriving boundary conditions on the interface between two dielectrics: a Gauss
pillbox and a circulation contour. $\mathbf{n}$ and $\boldsymbol{\tau}$ are the unit vectors which are, respectively,
normal and tangential to the interface.

This immediately gives $(D_n)_1 = (D_n)_2$, so that Eq. (38) yields



$$(\varepsilon_rE_n)_1 = (\varepsilon_rE_n)_2, \quad \text{i.e. } \varepsilon_1\frac{\partial\phi_1}{\partial n} = \varepsilon_2\frac{\partial\phi_2}{\partial n}. \tag{3.46}$$

Now, what about the tangential component ($E_\tau$) of the electric field? In dielectrics, the static electric
field is still potential, hence we can still use Eq. (1.28). Integrating this equation along a narrow
contour stretched along the interface (see the dashed line in Fig. 5), we get

$$(E_\tau)_1 = (E_\tau)_2, \quad \text{i.e. } \frac{\partial\phi_1}{\partial\tau} = \frac{\partial\phi_2}{\partial\tau}. \tag{3.47}$$

Note that this condition is compatible with (and follows from) the continuity of the
electrostatic potential itself, $\phi_1 = \phi_2$, at each point of the interface. That relation may be derived from the
electric field definition as the gradient of $\phi$ - see Eq. (1.33). Indeed, if the potential leaped at the border,
the electric field would be infinite.

Let us apply the boundary conditions (46)-(47), for example, to two thin ($t \ll d$) vacuum slits
cut in a uniform dielectric with an initially uniform 19 electric field $\mathbf{E}_0$ (Fig. 6). In both cases, a slit with
$t \to 0$ cannot modify the field distribution outside it substantially.

Fig. 3.6. Fields inside narrow slits cut in a linear dielectric: (a) a slit normal to the applied field,
with $\mathbf{E} = \mathbf{D}/\varepsilon_0 = \varepsilon_r\mathbf{E}_0$ inside it, and (b) a slit parallel to the field, with $\mathbf{D} = \varepsilon_0\mathbf{E} = \mathbf{D}_0/\varepsilon_r$ inside it.

For slit (a), normal to the applied field, we may apply Eq. (46) to the “major” (broad) interfaces,
shown horizontal in Fig. 6, to see that $\mathbf{D}$ should be continuous. But according to Eq. (33), this means
that inside the gap (i.e. in the vacuum, with $\mathbf{P} = 0$) the electric field equals $\mathbf{D}/\varepsilon_0$. This field, and hence
$\mathbf{D}$, may be measured, showing that the electric displacement is not a purely mathematical construct.
Superficially, this result violates the boundary condition (47) on the vertical (“minor”) interfaces of the
slit. Note, however, that the electric field within the gap is $\varepsilon_r$ times higher than in the dielectric outside
it. Hence the slit deforms the equipotential surfaces around it to concentrate the field inside itself. The
curving of the surfaces near the minor interfaces takes care of the fulfillment of Eq. (47) at these
interfaces.

On the contrary, for slit (b), parallel to the applied field, we may apply Eq. (47) to the major
(now, vertical) interfaces of the slit, to see that it is the electric field $\mathbf{E}$ that is continuous now, while the
electric displacement $\mathbf{D} = \varepsilon_0\mathbf{E}$ inside the gap is a factor of $\varepsilon_r$ lower than its value in the dielectric. (Any
perturbation of the field uniformity, caused by the compliance with Eq. (46) at the minor interfaces, is
settled at distances ~ $t$ from these interfaces.)


19 Actually, selecting the slit size $d$ much less than the characteristic scale of the field change, we can apply the
following arguments to any external field distribution.

For problems with piecewise-constant $\varepsilon$ but more complex geometries we may need to apply the
methods studied in Chapter 2. As in vacuum, in the simplest cases we can select such a set of orthogonal
coordinates that the electrostatic potential depends on just one of them. Consider, for example, two ways
of filling a plane capacitor with two different dielectrics - see Fig. 7.

In case (a), voltage $V$ between the electrodes is the same for each part of the capacitor, and at
least far from the dielectric interface, the electric field is vertical, uniform, and the same in both parts ($E = V/d$). Hence
the boundary condition (47) is satisfied even if such a distribution is valid near the interface as well, i.e. at
any point of the system. The only effect of different values of $\varepsilon$ in the two parts is that the electric
displacement $D = \varepsilon E$ and hence the electrodes' surface charge density $\sigma = D$ are different in the two parts.
Thus we can calculate the electrode charges $Q_{1,2}$ of the two parts independently, in each case using Eq.
(44), and then add up the results to get the total capacitance

$$C_m = \frac{Q_1 + Q_2}{V} = \frac{1}{d}\left(\varepsilon_1A_1 + \varepsilon_2A_2\right). \tag{3.48}$$

Note that this formula may be interpreted as the total capacitance of two separate capacitors connected
(by conducting wires) in parallel. This is natural, because we may cut the system along the dielectric
interface, without any effect on the fields in either part, and then connect the corresponding electrodes
by external wires, again without any effect on the system - except very close to the capacitor's edges.

Fig. 3.7. Plane capacitors filled with two different dielectrics.

Case (b) may be analyzed by applying Eq. (34) to a Gaussian pillbox with the lower lid inside
the (for example) bottom electrode, and the top lid in any of the layers. From this we see that $D$
anywhere inside the system should be equal to the surface charge density $\sigma$ of the lower electrode, i.e.
constant. Hence, in the top dielectric layer the electric field is constant: $E_1 = D_1/\varepsilon_1 = \sigma/\varepsilon_1$, while in
the bottom layer, similarly, $E_2 = D_2/\varepsilon_2 = \sigma/\varepsilon_2$. Integrating $E$ across the whole capacitor, we get

$$V = \int_0^{d_1+d_2}E(z)\,dz = E_1d_1 + E_2d_2 = \left(\frac{d_1}{\varepsilon_1} + \frac{d_2}{\varepsilon_2}\right)\sigma, \tag{3.49}$$

so that the mutual capacitance per unit area

$$\frac{C_m}{A} \equiv \frac{\sigma}{V} = \left[\frac{d_1}{\varepsilon_1} + \frac{d_2}{\varepsilon_2}\right]^{-1}. \tag{3.50}$$

Note that this result is equivalent to the total capacitance of a series connection of two plane
capacitors based on each of the layers. This is natural, because we could insert an uncharged thin
conducting sheet (rather than a cut as in the previous case) at the layer interface, which is an
equipotential surface, without changing the field distribution in the system. Then we could thicken the
conducting sheet as much as we like (turning it into a wire), also without changing the fields and hence
the capacitance.
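
The parallel and series interpretations of the two fillings can be checked with a few lines of Python. This is only a sketch: the helper names (`planar`, `side_by_side`, `stacked`) and the geometry are hypothetical, with the dielectric constants of SiO$_2$ and silicon borrowed from Table 3.1.

```python
EPS_0 = 8.8541878128e-12  # electric constant, F/m


def planar(eps_r, area, gap):
    """Eq. (3.45): capacitance of a planar capacitor filled with a dielectric."""
    return eps_r * EPS_0 * area / gap


def side_by_side(eps_r1, area1, eps_r2, area2, gap):
    """Eq. (3.48): two dielectrics sharing the same gap d (case a)."""
    return EPS_0 * (eps_r1 * area1 + eps_r2 * area2) / gap


def stacked(eps_r1, d1, eps_r2, d2, area):
    """Eq. (3.50) times the area: two layers on top of each other (case b)."""
    return area / (d1 / (eps_r1 * EPS_0) + d2 / (eps_r2 * EPS_0))


# Hypothetical geometry: plate area (m^2) and total gap (m).
A, d = 1.0e-4, 1.0e-6
c_a = side_by_side(3.9, A / 2, 11.7, A / 2, d)
c_b = stacked(3.9, d / 2, 11.7, d / 2, A)

# Circuit-theory cross-checks: parallel and series connections.
c_parallel = planar(3.9, A / 2, d) + planar(11.7, A / 2, d)
c_series = 1.0 / (1.0 / planar(3.9, A, d / 2) + 1.0 / planar(11.7, A, d / 2))

print(f"case (a): {c_a:.4e} F, parallel check: {c_parallel:.4e} F")
print(f"case (b): {c_b:.4e} F, series check:   {c_series:.4e} F")
```

For equal shares of the two materials, the side-by-side filling always gives the larger capacitance, as expected from comparing an arithmetic mean with a harmonic one.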

In order to warm up for more complex problems, let us see how the last problem could be solved
using the Laplace equation approach. Due to the symmetry of the system, the electrostatic potential in
each layer may only depend on one (in Fig. 7b, vertical) coordinate $z$, so that the Laplace equation in
each uniform part of the system is reduced to $d^2\phi/dz^2 = 0$. Hence in each layer the electrostatic potential
changes linearly, though possibly with different coefficients: $\phi_1 = c_{11}z + c_{12}$, and $\phi_2 = c_{21}z + c_{22}$.
Selecting the electrode potentials as $\phi(0) = 0$ and $\phi(d_1 + d_2) = V$, from those boundary conditions we get
$c_{12} = 0$, $c_{21}(d_1 + d_2) + c_{22} = V$, so that we need two more equations to find all four coefficients $c_{ij}$. These
additional equations come from the conditions of continuity of the potential ($c_{11}d_1 + c_{12} = c_{21}d_1 + c_{22}$)
and displacement ($\varepsilon_1c_{11} = \varepsilon_2c_{21}$) at the interface $z = d_1$. Solving these equations, we can readily find the
electric field and displacement in both layers, then the surface charge densities

$$\sigma(0) = D_n\big|_{z=0} = -\varepsilon_1\frac{d\phi_1}{dz}\bigg|_{z=0}, \qquad \sigma(d_1+d_2) = D_n\big|_{z=d_1+d_2} = \varepsilon_2\frac{d\phi_2}{dz}\bigg|_{z=d_1+d_2} \tag{3.51}$$

(which in this case are equal and opposite) and finally the capacitance per unit area, with (of course) the
same result (50).
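
The four linear equations for the coefficients $c_{ij}$ can also be solved numerically, which is a useful pattern for layered structures with more than two layers. Below is a minimal sketch in plain Python; the small Gauss-Jordan solver, the ordering of the unknowns, and the layer parameters are all illustrative assumptions, with layer 1 taken to occupy $0 < z < d_1$.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def layered_capacitor(eps1, d1, eps2, d2, voltage=1.0):
    """Coefficients of phi_1 = c11*z + c12 and phi_2 = c21*z + c22."""
    # Unknowns are ordered as (c11, c12, c21, c22).
    a = [
        [0.0, 1.0, 0.0, 0.0],          # phi_1(0) = 0
        [0.0, 0.0, d1 + d2, 1.0],      # phi_2(d1 + d2) = V
        [d1, 1.0, -d1, -1.0],          # continuity of phi at z = d1
        [eps1, 0.0, -eps2, 0.0],       # continuity of D_z at z = d1
    ]
    b = [0.0, voltage, 0.0, 0.0]
    return solve_linear(a, b)


EPS_0 = 8.8541878128e-12                 # electric constant, F/m
eps1, eps2 = 3.9 * EPS_0, 11.7 * EPS_0   # illustrative permittivities
d1, d2, V = 2.0e-7, 5.0e-7, 1.0          # illustrative thicknesses (m), voltage (V)

c11, c12, c21, c22 = layered_capacitor(eps1, d1, eps2, d2, V)
sigma_bottom = -eps1 * c11               # sigma(0) of Eq. (3.51)
sigma_top = eps2 * c21                   # sigma(d1 + d2) of Eq. (3.51)
c_per_area = abs(sigma_bottom) / V
c_eq_350 = 1.0 / (d1 / eps1 + d2 / eps2)  # Eq. (3.50)

print(c_per_area, c_eq_350)
```

The two electrode charge densities come out equal and opposite, and the capacitance per unit area agrees with Eq. (3.50) to rounding accuracy.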

Let us apply the same approach to a more complex problem, shown in Fig. 8a: a dielectric sphere
placed into a uniform external electric field $\mathbf{E}_0$.

Fig. 3.8. Dielectric sphere in an initially uniform electric field: (a) the problem, and (b) the
equipotential surfaces, as given by Eq. (55), for $\varepsilon_r = 3$.

In this case the Laplace equation is not one-dimensional, and hence invites the variable
separation method discussed in Sec. 2.5. From that discussion we already know, in particular, the
general solution (2.172) of the Laplace equation outside of the sphere. To satisfy the uniform-field
condition at $r \to \infty$, it reduces to

$$\phi_{r\ge R} = -E_0r\cos\theta + \sum_{l=1}^{\infty}\frac{b_l}{r^{l+1}}P_l(\cos\theta). \tag{3.52}$$

Inside the sphere we can only use the radial functions that are finite at $r \to 0$:

$$\phi_{r\le R} = \sum_{l=1}^{\infty}a_lr^lP_l(\cos\theta). \tag{3.53}$$

Now, writing the boundary conditions (46) and (47) at $r = R$, we see that for all coefficients $a_l$ and $b_l$
with $l \ge 2$ we (just like for the conducting sphere in vacuum) get homogeneous equations that have only
trivial solutions. Hence, all these terms may be dropped, while for the only surviving angular harmonic,
proportional to $P_1(\cos\theta) = \cos\theta$, Eqs. (46)-(47) yield two equations:

$$-E_0 - \frac{2b_1}{R^3} = \varepsilon_ra_1, \qquad -E_0R + \frac{b_1}{R^2} = a_1R. \tag{3.54}$$

Solving this simple system for $a_1$ and $b_1$, we get the final solution of the problem:

$$\phi_{r\ge R} = E_0\left(-r + \frac{\varepsilon_r-1}{\varepsilon_r+2}\,\frac{R^3}{r^2}\right)\cos\theta, \qquad \phi_{r\le R} = -E_0\frac{3}{\varepsilon_r+2}\,r\cos\theta. \tag{3.55}$$


Figure 8b shows the equipotential surfaces given by this solution, for a particular value of the
dielectric constant $\varepsilon_r$. Note that, just like for a conducting sphere, at $r > R$ the dielectric sphere produces
(on top of the uniform external field) a purely dipole field, with $\mathbf{p} = 4\pi R^3\varepsilon_0\mathbf{E}_0(\varepsilon_r - 1)/(\varepsilon_r + 2)$ - an
evident generalization of Eq. (11), to which our result tends at $\varepsilon_r \to \infty$. By the way, this property is
common: from the point of view of their electrostatic (but not transport!) properties, conductors may be
adequately described as dielectrics with $\varepsilon_r \to \infty$.

Another remarkable feature of Eqs. (55) is that the electric field inside the sphere is uniform, 20
with $r$-independent values

$$\mathbf{E} = \frac{3}{\varepsilon_r+2}\mathbf{E}_0, \quad \mathbf{D} \equiv \varepsilon_r\varepsilon_0\mathbf{E} = \varepsilon_0\frac{3\varepsilon_r}{\varepsilon_r+2}\mathbf{E}_0, \quad \mathbf{P} \equiv \mathbf{D} - \varepsilon_0\mathbf{E} = 3\varepsilon_0\frac{\varepsilon_r-1}{\varepsilon_r+2}\mathbf{E}_0. \tag{3.56}$$

In the limit $\varepsilon_r \to 1$ (the “vacuum sphere”, i.e. no sphere at all), the electric field inside the sphere
naturally tends to the external one, and its polarization disappears. In the opposite limit $\varepsilon_r \to \infty$ the
electric field inside the sphere vanishes, and the field outside the sphere approaches that we have found
for the conducting sphere - see Eq. (2.176).
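
The solution for the sphere is easy to verify numerically. The sketch below solves the two linear equations (3.54) by Cramer's rule and then checks the boundary conditions at $r = R$ by finite differences; the function names, the test angle and the step $h$ are arbitrary illustrative choices, and units with $E_0 = R = 1$ are assumed by default.

```python
import math


def sphere_coefficients(eps_r, e0=1.0, radius=1.0):
    """Solve the two linear Eqs. (3.54) for a_1 and b_1 by Cramer's rule."""
    # eps_r*a1 + (2/R^3)*b1 = -E0     (continuity of D_n)
    # R*a1     - (1/R^2)*b1 = -E0*R   (continuity of phi, i.e. of E_tau)
    m11, m12, rhs1 = eps_r, 2.0 / radius**3, -e0
    m21, m22, rhs2 = radius, -1.0 / radius**2, -e0 * radius
    det = m11 * m22 - m12 * m21
    a1 = (rhs1 * m22 - m12 * rhs2) / det
    b1 = (m11 * rhs2 - rhs1 * m21) / det
    return a1, b1


def potential(r, theta, eps_r, e0=1.0, radius=1.0):
    """Potential of Eqs. (3.52)-(3.53) with only the l = 1 harmonic kept."""
    a1, b1 = sphere_coefficients(eps_r, e0, radius)
    if r < radius:
        return a1 * r * math.cos(theta)
    return (-e0 * r + b1 / r**2) * math.cos(theta)


def interface_mismatch(eps_r, theta=0.7, radius=1.0, h=1.0e-6):
    """Jumps of phi and of D_n / eps_0 across r = R, by finite differences."""
    phi_in = potential(radius - h, theta, eps_r, radius=radius)
    phi_out = potential(radius + h, theta, eps_r, radius=radius)
    dphi_in = (phi_in - potential(radius - 2 * h, theta, eps_r, radius=radius)) / h
    dphi_out = (potential(radius + 2 * h, theta, eps_r, radius=radius) - phi_out) / h
    return phi_out - phi_in, dphi_out - eps_r * dphi_in


for eps_r in (1.0, 3.0, 80.1, 1.0e9):
    a1, b1 = sphere_coefficients(eps_r)
    print(f"eps_r = {eps_r:g}: E_in/E0 = {-a1:.4g}, p/(4 pi eps_0 R^3 E0) = {b1:.4g}")
```

The two limits discussed above show up directly: the internal field tends to the external one at $\varepsilon_r \to 1$ and vanishes at $\varepsilon_r \to \infty$, where the normalized dipole moment tends to 1, its conducting-sphere value.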

20 This is true for any ellipsoid, at arbitrary external field orientation.

To complete the discussion of this example, note a very curious result: the field $\mathbf{E}_{\rm self}$, created by
the dielectric sphere inside itself, is related to its polarization vector by a simple equation independent of
either the dielectric constant or the sphere's size:

$$\mathbf{E}_{\rm self} \equiv \mathbf{E} - \mathbf{E}_0 = -\frac{\varepsilon_r-1}{\varepsilon_r+2}\mathbf{E}_0 = -\frac{\mathbf{P}}{3\varepsilon_0}, \tag{3.57}$$

where factor 3 stems from the sphere's dimensionality. (For a round cylinder in a normal external field, the
similar relation is valid, but with factor 2.) This equality is just a particular manifestation of the
general relation (24). Indeed, if summed over all $N = nV$ similar dipoles $\mathbf{p}$, distributed inside the sphere
with constant density $n$ (so that the polarization vector $\mathbf{P} = n\mathbf{p}$ is constant), Eq. (24) yields

$$\int_V\mathbf{E}_{\rm self}(\mathbf{r})\,d^3r = -\frac{nV}{3\varepsilon_0}\,\mathbf{p}, \tag{3.58}$$

so that after division by $V$, and taking into account the field uniformity in our particular case, it
coincides with Eq. (57). 21 We will use these results in the following section to discuss the molecular
field effect.
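
For the record, the self-field relation follows from Eqs. (3.56) in two lines. The cylinder line below is not derived in this section; it assumes the standard result $\mathbf{E} = 2\mathbf{E}_0/(\varepsilon_r + 1)$ for a round dielectric cylinder in a normal field.

```latex
\begin{aligned}
\text{sphere:}\quad
\mathbf{E}_{\rm self} &\equiv \mathbf{E} - \mathbf{E}_0
 = \left(\frac{3}{\varepsilon_r + 2} - 1\right)\mathbf{E}_0
 = -\frac{\varepsilon_r - 1}{\varepsilon_r + 2}\,\mathbf{E}_0,
 \qquad
 \mathbf{P} = 3\varepsilon_0\,\frac{\varepsilon_r - 1}{\varepsilon_r + 2}\,\mathbf{E}_0
 \;\Longrightarrow\;
 \mathbf{E}_{\rm self} = -\frac{\mathbf{P}}{3\varepsilon_0}; \\
\text{cylinder (assumed):}\quad
\mathbf{E}_{\rm self} &= \left(\frac{2}{\varepsilon_r + 1} - 1\right)\mathbf{E}_0
 = -\frac{\varepsilon_r - 1}{\varepsilon_r + 1}\,\mathbf{E}_0,
 \qquad
 \mathbf{P} = (\varepsilon_r - 1)\,\varepsilon_0\mathbf{E}
 = 2\varepsilon_0\,\frac{\varepsilon_r - 1}{\varepsilon_r + 1}\,\mathbf{E}_0
 \;\Longrightarrow\;
 \mathbf{E}_{\rm self} = -\frac{\mathbf{P}}{2\varepsilon_0}.
\end{aligned}
```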

21 The reader may wonder how we have managed to prove Eq. (24), at least for this particular case, using only the
relations based on the dipole approximation (7) for the field, which does not cover the inner-dipole fields
responsible for Eq. (24) - see Fig. 3 and its discussion. The reason is that according to Eq. (30), the additional
field $\mathbf{E}_{\rm self}$ inside the sphere may be considered as being created by effective charges, of density $\rho_{\rm ef}$, distributed on
the sphere's surface. For these charges, field $\mathbf{E}_{\rm self}$ is internal, similar to the field between two charges, shown in Fig. 3.

Before doing that, let me briefly revisit the method of charge images that was discussed in Sec.
2.6, to find its new features pertaining to dielectrics. As the simplest example, consider a point charge
near a dielectric half-space - see Fig. 9 (cf. Fig. 2.24).

Fig. 3.9. Charge images for a dielectric half-space.

The Laplace equation in the upper half-space $z > 0$ (besides the charge point $\rho = 0$, $z = d$) may
still be satisfied using a single image charge $q'$ at point $\rho = 0$, $z = -d$, but now $q'$ may differ from $(-q)$.
In addition, in contrast to the conducting plane case, we should also find the field inside the dielectric
($z < 0$). This field cannot be contributed by the image charge, because it would provide a potential
divergence at its location. Thus, in that half-space we should try to use the real point source only, but
maybe with a re-normalized charge $q''$ rather than the genuine charge $q$ - see Fig. 9. As a result, we may
look for the potential distribution in the form

$$\phi(\rho, z) = \frac{1}{4\pi\varepsilon_0}\times
\begin{cases}
\dfrac{q}{\left[\rho^2 + (z-d)^2\right]^{1/2}} + \dfrac{q'}{\left[\rho^2 + (z+d)^2\right]^{1/2}}, & \text{for } z > 0,\\[2ex]
\dfrac{q''}{\left[\rho^2 + (z-d)^2\right]^{1/2}}, & \text{for } z < 0,
\end{cases} \tag{3.59}$$

at this stage with unknown $q'$ and $q''$. Plugging this solution into the boundary conditions (46) and (47)
at $z = 0$ (with $\partial/\partial n = \partial/\partial z$), we see that they are indeed satisfied (so that Eqs. (59) express the unique
solution of the boundary problem) if the effective charges $q'$ and $q''$ obey the following relations:

$$q - q' = \varepsilon_rq'', \qquad q + q' = q''. \tag{3.60}$$

Solving this simple system of linear equations, we get

$$q' = -\frac{\varepsilon_r - 1}{\varepsilon_r + 1}\,q, \qquad q'' = \frac{2}{\varepsilon_r + 1}\,q. \tag{3.61}$$

If $\varepsilon_r \to 1$, then $q' \to 0$, and $q'' \to q$ - both facts very natural, because in this limit (no
polarization!) we have to recover the unperturbed field of the initial point charge in both half-spaces. In
the opposite limit $\varepsilon_r \to \infty$ (which, according to our discussion of the last problem, should correspond to
a conducting plane), $q' \to -q$ (repeating the result we have discussed in very much detail in Sec. 2.6) and
$q'' \to 0$. According to the second of Eqs. (59), the last result means that the field in the dielectric tends to
zero in this limit, as it should.
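
A direct numerical check of the image-charge solution may be instructive. The Python sketch below evaluates the trial potential (3.59) with the charges (3.61) and measures, by finite differences, how well the two boundary conditions hold at several points of the plane $z = 0$. The function names, the step $h$ and the sample parameters are hypothetical, and units with $1/4\pi\varepsilon_0 = 1$ are assumed.

```python
import math


def image_charges(q, eps_r):
    """Eq. (3.61): effective charges for a dielectric half-space."""
    q_image = -(eps_r - 1.0) / (eps_r + 1.0) * q    # q', placed at z = -d
    q_renorm = 2.0 / (eps_r + 1.0) * q              # q'', placed at z = +d
    return q_image, q_renorm


def potential(rho, z, q, d, eps_r):
    """Eq. (3.59) in units where 1/(4*pi*eps_0) = 1."""
    q_image, q_renorm = image_charges(q, eps_r)
    if z > 0:
        return q / math.hypot(rho, z - d) + q_image / math.hypot(rho, z + d)
    return q_renorm / math.hypot(rho, z - d)


def interface_mismatch(rho, q, d, eps_r, h=1.0e-6):
    """Jumps of phi and of D_z / eps_0 across z = 0, by finite differences."""
    phi_up = potential(rho, h, q, d, eps_r)
    phi_down = potential(rho, -h, q, d, eps_r)
    dz_up = (potential(rho, 2 * h, q, d, eps_r) - phi_up) / h
    dz_down = (phi_down - potential(rho, -2 * h, q, d, eps_r)) / h
    return phi_up - phi_down, dz_up - eps_r * dz_down


q, d, eps_r = 1.0, 1.0, 4.0
for rho in (0.0, 0.5, 2.0):
    jump_phi, jump_dz = interface_mismatch(rho, q, d, eps_r)
    print(f"rho = {rho}: jump of phi = {jump_phi:.1e}, jump of D_z = {jump_dz:.1e}")
```

Both mismatches are of the order of the step $h$, i.e. they vanish within the accuracy of the finite differences.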

Finally, following the logic of Chapter 2, at this point it would be appropriate to discuss the
Green's function method. However, due to the time/space restrictions, I will skip this discussion,
especially because all of the method's philosophy remains absolutely the same as for the vacuum case,
so that the generalization to the case of dielectrics is straightforward.


3.4. Molecular field effects

In 1850, O.-F. Mossotti and (probably, independently, but almost 30 years later!) R. Clausius
made an interesting experimental observation known now, rather unfairly, as the Clausius-Mossotti
relation: if density $n$ of molecules in a chemical compound may be changed without changing its
molecular structure, then the following ratio,

$$\frac{\varepsilon_r - 1}{\varepsilon_r + 2}, \tag{3.62}$$

is approximately proportional to $n$. For $\varepsilon_r \to 1$, i.e., $n \to 0$, there is no surprise here: according to Eq.
(41), for independent molecular dipoles $\varepsilon_r - 1 = 4\pi\alpha_{\rm mol}n \propto n$. However, at larger density $n$, the
effective field $\mathbf{E}_{\rm ef}$, acting on each dipole, includes not only the external field $\mathbf{E}_0$, but also a substantial
“molecular field” $\mathbf{E}_{\rm mol}$ of the surrounding dipoles:

$$\mathbf{E}_{\rm ef} = \mathbf{E}_0 + \mathbf{E}_{\rm mol}(0), \tag{3.63}$$

where the position of the particular dipole we are discussing is taken for $\mathbf{r} = 0$. Let us calculate $\mathbf{E}_{\rm mol}(0)$,
using a very simple model: a regular cubic lattice of identical dipoles (Fig. 10). In a Cartesian
coordinate system with axes directed along the lattice vectors, coordinates of the dipoles are

$$x_{jkl} = aj, \quad y_{jkl} = ak, \quad z_{jkl} = al, \tag{3.64}$$

where $j$, $k$, and $l$ are the integers numbering the dipoles. Now we may use the last form of Eq. (13), and
the linear superposition principle, to calculate one of the Cartesian components (say, along axis $x$) of the
molecular field induced by all other dipoles of the lattice:


$$(E_{\rm mol})_x(0) = \frac{1}{4\pi\varepsilon_0a^3}\sum_{j,k,l=-\infty}^{+\infty}\frac{3j(jp_x + kp_y + lp_z) - p_x(j^2 + k^2 + l^2)}{(j^2 + k^2 + l^2)^{5/2}}, \tag{3.65}$$

with the term $j = k = l = 0$ excluded. The sums of all cross-terms, proportional to $jk$ and $jl$,
vanish due to the system symmetry, so that Eq. (65) reduces to

$$(E_{\rm mol})_x(0) = \frac{p_x}{4\pi\varepsilon_0a^3}\sum_{j,k,l=-\infty}^{+\infty}\frac{3j^2 - (j^2 + k^2 + l^2)}{(j^2 + k^2 + l^2)^{5/2}}. \tag{3.66}$$

Since all the sums participating in this expression are equal,

$$\sum_{j,k,l=-\infty}^{+\infty}\frac{j^2}{(j^2 + k^2 + l^2)^{5/2}} = \sum_{j,k,l=-\infty}^{+\infty}\frac{k^2}{(j^2 + k^2 + l^2)^{5/2}} = \sum_{j,k,l=-\infty}^{+\infty}\frac{l^2}{(j^2 + k^2 + l^2)^{5/2}}, \tag{3.67}$$

we get $(E_{\rm mol})_x(0) = 0$. Due to the system symmetry, the same result is valid for all other components of
the molecular field. Hence, $\mathbf{E}_{\rm mol}(0) = 0$, and (due to the equivalence of all the dipoles of the system), the
molecular field vanishes at the location of each dipole, so that Eq. (63) is reduced to $\mathbf{E}_{\rm ef} = \mathbf{E}_0$.
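
The cancellation can be seen explicitly by brute-force summation. The following Python sketch evaluates the dimensionless sum of Eq. (3.66) and the three sums of Eq. (3.67); truncating the lattice to a cube $|j|, |k|, |l| \le n_{\max}$ is an assumption of this sketch, chosen because it keeps the three axes equivalent.

```python
def lattice_sum(n_max):
    """Dimensionless sum of Eq. (3.66), truncated to the cube |j|,|k|,|l| <= n_max."""
    total = 0.0
    for j in range(-n_max, n_max + 1):
        for k in range(-n_max, n_max + 1):
            for l in range(-n_max, n_max + 1):
                s = j * j + k * k + l * l
                if s == 0:
                    continue          # the dipole at the origin is excluded
                total += (3 * j * j - s) / s**2.5
    return total


def axis_sum(n_max, axis):
    """One of the sums of Eq. (3.67): axis = 0, 1, 2 selects j^2, k^2, l^2."""
    total = 0.0
    for j in range(-n_max, n_max + 1):
        for k in range(-n_max, n_max + 1):
            for l in range(-n_max, n_max + 1):
                s = j * j + k * k + l * l
                if s:
                    total += (j, k, l)[axis] ** 2 / s**2.5
    return total


for n_max in (1, 4, 8):
    sums = [axis_sum(n_max, axis) for axis in range(3)]
    print(n_max, sums, lattice_sum(n_max))
```

For every cutoff the three axis sums coincide and the full sum vanishes to rounding accuracy, for example the six nearest neighbors alone contribute $2 \times 2 - 4 \times 1 = 0$.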



Fig. 3.10. Cubic lattice of similar dipoles. 


In order to relate the external field $\mathbf{E}_0$ and the average dipole 22 field $\mathbf{E}$ in the medium, we may
use Eq. (56) for a uniform, macroscopic sphere 23 with a radius much larger than the inter-dipole distance
$a$, so that our assumption of infinite limits of the rapidly converging sum (65) is not substantially
affected:

$$\mathbf{E} = \frac{3}{\varepsilon_r + 2}\mathbf{E}_0 = \frac{3}{\varepsilon_r + 2}\mathbf{E}_{\rm ef}. \tag{3.68}$$

Now we may plug this relation into the general formula (37) for linear dielectrics:

$$\mathbf{P} = (\varepsilon_r - 1)\,\varepsilon_0\mathbf{E} = \frac{3(\varepsilon_r - 1)}{\varepsilon_r + 2}\,\varepsilon_0\mathbf{E}_{\rm ef}. \tag{3.69}$$

This “macroscopic” relation has to give the same result as the “microscopic” Eq. (40) - with the
replacement $\mathbf{E} \to \mathbf{E}_{\rm ef}$ which reflects the fact that in the general case each dipole is polarized by the
effective field (63) rather than the average field $\mathbf{E}$:

$$\mathbf{P} = 4\pi\varepsilon_0\alpha_{\rm mol}n\mathbf{E}_{\rm ef}. \tag{3.70}$$


22 This qualifier is important: $\mathbf{E}$ is the long-range (dipole field) average participating in the macroscopic Maxwell
equations, rather than the exact average that would include the inner-dipole fields, for which Eq. (24) would be
valid.

23 This geometry, due to its isotropy, most fairly represents the relation between $\mathbf{E}$ and $\mathbf{E}_0$.

The comparison yields the so-called Lorentz-Lorenz formula, 24

$$\frac{\varepsilon_r - 1}{\varepsilon_r + 2} = \frac{4\pi}{3}\alpha_{\rm mol}n, \tag{3.71}$$

that complies with the Clausius-Mossotti observation, provided that the molecular polarizability $\alpha_{\rm mol}$ is
independent of density. (This is a good approximation at least for weak “molecular” bonding.)

It is somewhat surprising how many dielectric materials obey Eq. (71) rather well, given its
approximate nature. Indeed, its derivation is based on the assumption of a specific crystal lattice and,
more importantly, that the molecules are localized exactly at the crystal lattice nodes, and that the field of
each molecule may be expressed by the dipole approximation. In reality, the atoms' electrons, which
participate in the dipole moment formation, are spread in space due to quantum-mechanical uncertainty
on a scale that may be comparable with distances between the molecules.

Solving Eq. (71) for the dielectric constant, we get

$$\varepsilon_r = \frac{1 + 8\pi\alpha_{\rm mol}n/3}{1 - 4\pi\alpha_{\rm mol}n/3}. \tag{3.72}$$

If the dipole density is low, $\alpha_{\rm mol}n \ll 1$, we get our old result (41) corresponding to independent dipoles,
and hence to $\mathbf{E}_{\rm ef} = \mathbf{E}$. However, at high dipole density and/or polarizability, the effective field acting on
each dipole,

$$\mathbf{E}_{\rm ef} = \frac{\varepsilon_r + 2}{3}\,\mathbf{E} = \frac{\mathbf{E}}{1 - 4\pi\alpha_{\rm mol}n/3}, \tag{3.73}$$

may be substantially larger than the average field $\mathbf{E}$, due to the molecular field contribution. Note that $\varepsilon_r$, the
$E_{\rm ef}/E$ ratio, and hence the electric susceptibility

$$\chi_e \equiv \frac{P}{\varepsilon_0E} = \frac{4\pi\alpha_{\rm mol}n}{1 - 4\pi\alpha_{\rm mol}n/3}, \tag{3.74}$$

all diverge when the density-polarizability product approaches the critical value $\alpha_{\rm mol}n = 3/4\pi$.
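
To see how quickly the molecular field correction sets in, one may tabulate Eqs. (3.41), (3.72) and (3.73) as functions of the product $x \equiv \alpha_{\rm mol}n$. The Python sketch below does this for a few fractions of the critical value; the function names and the sample fractions are illustrative choices only.

```python
import math


def eps_r_independent(x):
    """Eq. (3.41): eps_r for non-interacting dipoles, x = alpha_mol * n."""
    return 1.0 + 4.0 * math.pi * x


def eps_r_lorentz_lorenz(x):
    """Eq. (3.72): eps_r from the Lorentz-Lorenz formula, x = alpha_mol * n."""
    y = 4.0 * math.pi * x / 3.0
    if y >= 1.0:
        raise ValueError("alpha_mol * n is at or above the critical value 3/(4*pi)")
    return (1.0 + 2.0 * y) / (1.0 - y)


def field_ratio(x):
    """Eq. (3.73): the ratio E_ef / E."""
    return 1.0 / (1.0 - 4.0 * math.pi * x / 3.0)


x_crit = 3.0 / (4.0 * math.pi)
print("x/x_crit  eps_r(3.41)  eps_r(3.72)  E_ef/E")
for fraction in (0.001, 0.1, 0.5, 0.9, 0.99):
    x = fraction * x_crit
    print(f"{fraction:8.3f}  {eps_r_independent(x):11.3f}  "
          f"{eps_r_lorentz_lorenz(x):11.3f}  {field_ratio(x):6.2f}")
```

At small $x$ the two expressions for $\varepsilon_r$ coincide, while near the critical value only Eq. (3.72) shows the divergence discussed above.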

This is essentially a rudimentary 25 description of the transition from linear dielectrics to the
so-called ferroelectrics with self-sustained (spontaneous) polarization even in the absence of external
electric field. These materials are typically recognized by the hysteretic behavior of their polarization as
a function of applied electric field - see, for example, Fig. 11.

Ferroelectric materials are being actively explored as the active materials for nonvolatile
random-access memories (dubbed either FRAM or FeRAM). 26 In cells of this memory, binary
information is stored in the form of one of two possible directions of spontaneous polarization at $E = 0$ -
see, e.g., Fig. 11, and is read out by the effect of the average electric field on a nearby semiconductor
field-effect transistor. Unfortunately, most materials suitable for fabrication of ferroelectric thin films
are rather complex and incompatible with standard processes of microelectronics. In addition, the time
of spontaneous depolarization of ferroelectric thin films is typically well below 10 years - the
industrial standard for data retention in nonvolatile memories, and this time may be decreased even
more by “fatigue” from repeated polarization recycling. Due to these reasons, industrial production of
FRAM is currently just a tiny, few-percent fraction of the nonvolatile memory market (which is
currently dominated by floating-gate memories - see Sec. 4.2).

Fig. 3.11. Ferroelectric hysteretic loops: (a) for various material types
(schematically), and (b) for several amplitudes of the applied ac electric field.
(Panel b, showing recent (2013) experimental results by S.-W. Jung et al. for an
inkjet-printed layer of organic semiconductor PC12TV12T, is adapted from
http://etrij.etri.re.kr/etrij/journal/article/article.do?volume=35&issue=4&page=734 .)

Other polarization effects are also possible, e.g., antiferroelectricity or helielectricity.
Unfortunately, we will not have time for a discussion of these exotic phenomena in this course; 27 the
main reason I am mentioning them is to emphasize again that the “material relation” $\mathbf{P} = \mathbf{P}(\mathbf{E})$ is by no
means exact or fundamental, though most materials, in practicable fields, behave as linear dielectrics.


24 It was derived in 1869 by L. Lorenz and then (in 1878) independently by H. Lorentz. Actually, they
discussed optical frequencies at which $\varepsilon_r$ should be understood as the square of the refraction coefficient at the
wave frequency (see Chapter 7), but since the optical wavelengths ~ $10^{-4}$ m are much longer than interatomic
distances $a$ ~ $10^{-9}$ m, the derivation remains absolutely the same in electrostatics.

25 Any quantitative description of this transition should account for thermal fluctuations of the molecular
dipoles, which reduce the dipole-dipole ordering and hence suppress the transition to the ferroelectric phase until
temperature has been lowered to a certain Curie temperature $T_c$ - named after P. Curie (1859-1906). Right above
that temperature, the dielectric remains linear, but has a high, temperature-dependent dielectric constant that
diverges at $T \to T_c$. Such materials are frequently called paraelectric, and the paraelectric-to-ferroelectric
transition at $T_c$ in crystals is a typical example of a continuous (or “second-order”) phase transition - see, e.g.,
SM Sec. 4.4. (As will be discussed in Sec. 5.5 below, some magnetic materials exhibit a very similar phase
transition between their ferromagnetic and paramagnetic phases.) Moreover, in non-crystalline materials, such as
bulk ceramics and thin films, the ferroelectric behavior is further complicated by different, field-dependent
directions of polarization $\mathbf{P}$ in individual “domains” of the sample, making the average hysteresis more smooth
(Fig. 11a) and dependent on the sample's polarization history - for example the amplitude of the applied ac electric
field (Fig. 11b).

26 See, e.g., J. F. Scott, Ferroelectric Memories , Springer, 2000. 

27 For a detailed coverage of ferroelectrics, I can recommend an encyclopedic monograph by M. Lines and A. 
Glass, Principles and Applications of Ferroelectrics and Related Materials, Oxford U. Press, 2001, and a recent 
collection of reviews by K. Rabe, C. Ahn, and J.-M. Triscone (eds.), Physics of Ferroelectrics: A Modern 
Perspective, Springer, 2010. 
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3.5. Energy of electric field in a dielectric 

In Chapter 1, we have obtained two key results for the electrostatic energy: Eq. (1.54) for a 
charge interaction with an independent (“external”) field, and a similarly structured formula (1.62), but 
with an additional factor Vi, for the field is produced by the charges under consideration. Both relations 
could be merged and rewritten in a “local” form involving energy density u - see Eq. (1.67). These 
equations are of course always valid for dielectrics as well if the charge density includes all charges 
(including those bound into dipoles), but it is convenient to recast them unto a form depending on 
density p{ r) of only “stand-alone” charges. 

If a field is created only by the stand-alone charges under consideration, and is proportional to ρ(r)
(requiring that we deal with a linear dielectric!), we can repeat all the argumentation of the beginning of
Sec. 1.3, and again arrive at Eq. (1.62), provided that φ is calculated correctly, i.e., with due account of
the dielectric. Now we can recast this result in terms of fields - essentially as this was done in Eqs.
(1.64)-(1.66), but now making a clear distinction between the electric field E (that still equals -∇φ) and
the electric displacement field D that obeys the macroscopic Maxwell equation (32), ∇·D = ρ. Plugging ρ(r),
expressed from that equation, into Eq. (1.62), we get

U = \frac{1}{2}\int (\nabla\cdot\mathbf{D})\,\phi\; d^3r. (3.75)

Using the fact 28 that for any differentiable functions φ and D,

(\nabla\cdot\mathbf{D})\,\phi = \nabla\cdot(\phi\mathbf{D}) - (\nabla\phi)\cdot\mathbf{D}, (3.76)

we may rewrite Eq. (75) as 

U = \frac{1}{2}\int \nabla\cdot(\phi\mathbf{D})\, d^3r - \frac{1}{2}\int (\nabla\phi)\cdot\mathbf{D}\, d^3r. (3.77)

The divergence theorem, applied to the first term, reduces it to a surface integral of φD_n. (As a reminder, in
Eq. (1.65) the integral was of φ(∇φ)_n ∝ φE_n.) If the surface of the volume we consider is sufficiently far,
this surface integral vanishes. On the other hand, the gradient in the second term of Eq. (77) is just
(minus) field E, so that it gives

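U = \frac{1}{2}\int \mathbf{E}\cdot\mathbf{D}\, d^3r = \frac{1}{2}\int \mathbf{E}(\mathbf{r})\cdot\varepsilon(\mathbf{r})\mathbf{E}(\mathbf{r})\, d^3r = \frac{\varepsilon_0}{2}\int \varepsilon_r(\mathbf{r})\,E^2(\mathbf{r})\, d^3r. (3.78)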

This expression is a natural generalization of Eq. (1.67) and shows that we can, like we did in vacuum,
present the electrostatic energy in a local form 29

Field energy in a linear dielectric
U = \int u(\mathbf{r})\, d^3r, \qquad u(\mathbf{r}) = \frac{1}{2}\mathbf{E}(\mathbf{r})\cdot\mathbf{D}(\mathbf{r}). (3.79)

28 See, e.g., MA Eq. (11.4a).

29 Again, in Gaussian units this expression should be divided by 4π.

Again, this expression is not valid for nonlinear dielectrics, because our starting point, Eq.
(1.62), is only valid if φ is proportional to ρ. In order to make our calculation more general, we should
intercept our calculations in Sec. 1.3 at an earlier stage, at which we have not yet used this
proportionality. For example, Eq. (1.54) may be rewritten, in the continuous limit, as

\delta U = \int \phi(\mathbf{r})\,\delta\rho(\mathbf{r})\, d^3r, (3.80)

where symbol δ means a small variation of the function - e.g., its change in time, sufficiently slow to
ignore the relativistic and magnetic-field effects. Applying such variation to Eq. (32), and plugging the
resulting δρ = ∇·δD into Eq. (80), we get

\delta U = \int (\nabla\cdot\delta\mathbf{D})\,\phi\; d^3r. (3.81)

(Note that in contrast to Eq. (75), this expression does not have the factor ½.) Now repeating the same
calculations as in the linear case, for the energy density variation we get a remarkably simple (and
general!) expression,

General energy variation
\delta u = \mathbf{E}\cdot\delta\mathbf{D}. (3.82)

This is as far as we can go for the general dependence D(E). If the dependence is linear and
isotropic, as in Eq. (38), then δD = εδE and

\delta u = \varepsilon\,\mathbf{E}\cdot\delta\mathbf{E}. (3.83)

Integration of this expression over variations, from zero field to a certain final distribution E(r), brings
us back to Eq. (79).

Another important role of Eq. (82) is that it shows that Cartesian components of E may be
interpreted as generalized forces, and those of D as generalized coordinates (per unit volume). 30 This
allows one to form the proper Gibbs potential energy 31 of a system inside some volume V, placed in an
external electric field E_ext:

Gibbs potential energy
\mathcal{G} = \int_V g(\mathbf{r})\, d^3r, \qquad g(\mathbf{r}) \equiv u(\mathbf{r}) - \mathbf{E}_{\rm ext}\cdot\mathbf{D}. (3.84)

As an analytical mechanics reminder, if a generalized external force (in our case, E_ext) is fixed,
the stable equilibrium of the system corresponds to the minimum of 𝒢, rather than of the potential energy
U as such - in our case, that of the field:

U = \int_V u(\mathbf{r})\, d^3r, \qquad u(\mathbf{r}) = \int \mathbf{E}\cdot\delta\mathbf{D}. (3.85)

In order to illustrate this important point, let us return to the simple case of a system with linear
dielectric(s), in which δD ∝ δE ∝ δE_ext, so that Eq. (85) may be explicitly integrated over the external
field variation, to reproduce the second of Eqs. (79):

u(\mathbf{r}) = \frac{1}{2}\mathbf{E}\cdot\mathbf{D}. (3.86)

30 This is the point where the SI units, assigning different dimensionalities to fields E and D, are more revealing
than the Gaussian units.

31 See, e.g., CM Sec. 1.5. Note that Eq. (84) clearly illustrates, once again, that the difference between potential
energies 𝒢 and U, usually discussed in courses of statistical physics and/or thermodynamics as the difference
between the Gibbs and Helmholtz free energies (see, e.g., SM 1.6), exists regardless of statistics or thermal
motion.

In this case, Eq. (84) yields 

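g(\mathbf{r}) = \frac{1}{2}\mathbf{E}\cdot\mathbf{D} - \mathbf{E}_{\rm ext}\cdot\mathbf{D} = \frac{\varepsilon}{2}E^2 - \varepsilon\,\mathbf{E}\cdot\mathbf{E}_{\rm ext} \equiv \frac{\varepsilon}{2}\left(\mathbf{E} - \mathbf{E}_{\rm ext}\right)^2 + {\rm const}, (3.87)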

where the constant may depend on the external field, but not on the resulting field distribution. As a
sanity check, let us apply this result to a volume V well inside a long dielectric cylinder placed into a
uniform external field E_ext parallel to the cylinder’s axis. (Such orientation is important to ignore the
geometric effects discussed in Sec. 3 - see, e.g., Fig. 6 and its discussion.) Then E has to be uniform in
the dominating part of the cylinder, so that Eq. (84) may be explicitly integrated over the volume,
giving

\mathcal{G} = \frac{\varepsilon}{2}\left(\mathbf{E} - \mathbf{E}_{\rm ext}\right)^2 V + {\rm const}. (3.88)

The minimum of this function is achieved at the evidently correct result E = E_ext - in contrast to the
unphysical result E = 0 (meaning the electric field’s expulsion from the volume) that we would get
minimizing U.
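
The contrast between the two minimization prescriptions can be checked numerically. The following is only an illustrative sketch for a uniform field E parallel to E_ext: the numerical values of the permittivity, the external field, and the volume, as well as the function names and the brute-force grid search, are assumptions made for this demonstration and are not part of the notes.

```python
# Illustrative sketch: compare the minima of the field energy U and of the
# Gibbs energy G for a uniform field E inside a linear dielectric, cf. Eq. (3.88).
# All numerical values below are assumed, for demonstration only.

EPS0 = 8.854e-12          # electric constant, F/m
eps = 4.0 * EPS0          # assumed permittivity of the dielectric, F/m
E_ext = 1.0e5             # assumed external field, V/m
volume = 1.0e-6           # assumed volume, m^3


def field_energy(E):
    """U = (eps/2) E^2 V: the volume integral of Eq. (3.86) for a uniform field."""
    return 0.5 * eps * E**2 * volume


def gibbs_energy(E):
    """G = [(eps/2) E^2 - eps E E_ext] V: Eq. (3.87) for uniform E parallel to E_ext."""
    return (0.5 * eps * E**2 - eps * E * E_ext) * volume


def argmin_on_grid(func, lo, hi, steps=2001):
    """Brute-force minimizer on a uniform grid (sufficient for a parabola)."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=func)


E_min_U = argmin_on_grid(field_energy, -2 * E_ext, 2 * E_ext)
E_min_G = argmin_on_grid(gibbs_energy, -2 * E_ext, 2 * E_ext)
print(f"U is minimal at E = {E_min_U:.3g} V/m (field expulsion)")
print(f"G is minimal at E = {E_min_G:.3g} V/m (to be compared with E_ext)")
```

The grid search is used instead of an analytical derivative only to keep the example independent of any particular library.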


3.6. Exercise problems 

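3.1.* Prove the following extension of Eq. (5):

\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\left( \frac{1}{r}\,Q + \frac{1}{r^3}\sum_{j=1}^{3} r_j p_j + \frac{1}{2r^5}\sum_{j,j'=1}^{3} r_j r_{j'} Q_{jj'} + \dots \right),

where Q is a scalar - the total charge of the system, p_j are the Cartesian components of a vector - the
system’s dipole moment (6), and Q_jj' are the Cartesian components of a tensor - the system’s quadrupole
moment:

Q = \int \rho(\mathbf{r}')\, d^3r', \qquad p_j = \int \rho(\mathbf{r}')\, r'_j\, d^3r', \qquad Q_{jj'} = \int \rho(\mathbf{r}')\left(3 r'_j r'_{j'} - r'^2\delta_{jj'}\right) d^3r'.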


3.2. A plane, thin ring of radius R is charged with a constant linear density λ. Calculate the exact
distribution of the electrostatic potential along the symmetry axis of the ring, and prove that at
large distances, r ≫ R, it is indeed described by the multipole expansion spelled out in Problem 1.

3.3. Without carrying out an exact calculation, can you predict the spatial dependence of the
interaction between various electric multipoles, including point charges (in this context, frequently
called monopoles), dipoles, and quadrupoles? Based on these predictions, what is the functional
dependence of the interaction between dumbbell-shaped diatomic molecules such as H2, N2, O2, etc., on
the distance between them, if the distance is much larger than the molecular size?

3.4. In suitable reference frames, calculate the dipole and quadrupole moments of the following
systems (see Figs. below):

(i) 4 point charges of the same magnitude, but alternating signs, placed in the corners of a square;

(ii) a similar system, but with a pair charge sign alternation; and

(iii) a point charge in the center of a thin ring carrying a similar but opposite charge, uniformly
distributed along its circumference.



3.5. Two similar electric dipoles, of fixed magnitude p, located at fixed distance r from each
other, are free to rotate, changing their directions. What stable equilibrium position(s) may they take as a 
result of their electrostatic interaction? 


3.6. An electric dipole is located above an infinite conducting
plane (see Fig. on the right). Calculate:

(i) the distribution of the induced charge in the conductor,

(ii) the dipole-to-plane interaction energy, and

(iii) the force and the torque acting on the dipole.

[Fig.: the conducting plane is labeled φ = 0.]


3.7. Use two different approaches to calculate the energy of interaction between a grounded
conductor and an electric dipole p, placed in the center of a spherical cavity of radius R, carved in the 
conductor. 

3.8. A plane separating two parts of otherwise free space is densely and uniformly (with a constant
areal density n) filled with dipoles, with their dipole moments p oriented in a direction normal to the
plane.

(i) Calculate the boundary conditions for the electrostatic potential on both sides of the plane. 

(ii) Use the result of Task (i) to calculate the potential distribution created in space by a spherical 
surface, with radius R, densely and uniformly filled with radially-oriented dipoles. 

(iii) What condition should be imposed on the dipole density n for your results to be
qualitatively valid?

3.9. A plane capacitor, with zero voltage between its conducting plates (as may be fixed, e.g., with an
external wire - see Fig. on the right), is partly filled with a material with spontaneous, constant
polarization P0. 32 Find the distributions of the electric field, electric displacement, and the surface
charge density of each plate.

3.10. A sphere of radius R is made of a material with a uniform, fixed polarization P0.

(i) Calculate the electric field everywhere in space - both inside and outside the sphere.

(ii) Explore the limit R → 0, keeping P0R³ = const, and compare the result with Eq. (25).

3.11. Discuss the physics of Eq. (3.85) of the lecture notes, in particular the physical nature of
the potential energy U in a dipole medium. Apply your conclusion to a material with fixed (field-independent)
polarization P0, and calculate the electric field energy of the uniformly polarized sphere
considered in the previous problem.

3.12. Experimental plots in Fig. on the right show that the polarization of EuMn2O5, a typical
ferroelectric/paraelectric material, becomes almost linear at 50 K. Use the plot to calculate (with an
accuracy better than 10%) its dielectric constant ε_r at this temperature.

[Figure: polarization as a function of electric field (V/cm); horizontal axis from -8000 to 8000 V/cm, vertical axis ticks from -0.04 to 0.04.]

3.13. In two separate experiments, a thin, plane sheet of a linear dielectric with ε_r = const is
placed into a uniform external electric field E0:

(i) with sheet’s surface parallel to the electric field, and 

(ii) the surface perpendicular to the field. 

For each case, find the electric field E, the electric displacement D, and the polarization vector P inside 
the dielectric (far from sheet’s edges). 

3.14. A point charge q is located at distance r ≫ R from the center of a uniform sphere of radius
R, made of a uniform linear dielectric. In the first nonvanishing approximation in small parameter R/r,
calculate the interaction force, and the energy of interaction between the sphere and the charge.

3.15. A fixed dipole p is placed in the center of a spherical cavity of radius R, cut inside a
uniform, linear dielectric. Calculate the electric field distribution everywhere in the system (both for r < 
R and r > R). 

Hint : You may start with the assumption that the field at r > R has a distribution typical for a 
dipole (but be ready for surprises :-). 



32 In electrical engineering, such materials (typically, synthetic polymers) are frequently called electrets. As an 
approximation, this condition may be applied to hard ferroelectrics, if the external or self-induced electric fields 
are not too high. 
3.16. A spherical capacitor (see Fig. on the right) is filled with a linear
dielectric whose permittivity ε depends on spherical angles θ and φ, but not on
the distance r from the system’s center. Give an explicit expression for its
capacitance C.



3.17. For each of the two capacitors shown in Fig. 3.7 of the lecture notes, calculate the electric
forces (per unit area) on the boundaries of two uniform dielectrics, in terms of the electric fields.


3.18. A uniform electric field E0 has been created (by external
sources) inside a uniform linear dielectric. Find the change of the electric
field, created by cutting out a cavity in the shape of a round cylinder of
radius R, with the axis perpendicular to the external field - see Fig. on the
right.



3.19. Small linear-dielectric particles of spherical shape are dispersed in free space with a low
concentration n ≪ 1/R³, where R is the particle's radius. Calculate the average dielectric constant of such a
medium. Compare the result with the apparent, but incomplete answer

\bar{\varepsilon}_r - 1 = (\varepsilon_r - 1)\,nV,

(where ε_r is the dielectric constant of the particle's material and V = (4π/3)R³ is its volume), and explain the
origin of the difference.

3.20. Calculate the spatial distribution of the electric potential induced by a
point charge q placed at distance d from a very wide parallel plate, of thickness D,
made of a linear dielectric - see Fig. on the right.

Chapter 4. DC Currents

In this chapter I discuss the laws governing the distribution of constant (“dc”) currents inside conducting
media, with a focus on the linear (“Ohmic”) conductivity. In most cases, the partial differential equation
governing the distribution may be reduced to the same Laplace and Poisson equations whose solution
methods have been discussed in detail in Chapter 2. Due to this fact, this chapter is rather short.


4.1. Continuity equation and the Kirchhoff laws

Until this point, our discussion of conductors has been limited to the cases when they are 
separated with insulators (meaning either vacuum or dielectric media) preventing any continuous 
motion of charges from one conductor to another, even if there is a voltage difference (and hence 
electric field) between them - see Fig. 1a.



Fig. 4.1. Two oppositely charged conductors: (a) at the electrostatic situation, (b) at charge relaxation 
through an additional narrow conductor (“wire”), and (c) a system sustaining dc current in the wire. 


Now let us connect two conductors galvanically, say with a wire - a thin, elongated conductor
(Fig. 1b). Then the electric field causes the motion of charges in the wire - from the conductor with a
higher electrostatic potential toward that with a lower potential, until the potentials equilibrate. Such a
process is called charge relaxation. The main equation governing this process may be obtained from the
experimental fact (already mentioned in Sec. 1.1) that electric charges cannot appear or disappear
(though opposite charges may recombine with the conservation of the net charge). As a result, the
charge Q of one conductor may change only due to the current I through the wire: 1

\frac{dQ}{dt} = -I. (4.1)


1 Just as a (hopefully, unnecessary :-) reminder, in the SI units the current is measured in amperes (A). In legal
metrology, the ampere (rather than the coulomb, which is defined as 1 C = 1 A × 1 s) is a primary unit. I will
mention its formal definition in the next chapter. In the Gaussian units, Eq. (1) remains the same, so that the
current’s unit is the so-called statampere - defined as statcoulomb per second.
Let us express this law in a differential form, introducing the notion of the current density vector
j(r). This vector may be defined via the following relation for the current dI crossing an elementary area dA
(Fig. 2):

dI = j\,dA\cos\theta = (j\cos\theta)\,dA = j_n\,dA, (4.2)

where θ is the angle between the normal to the surface and the carrier motion direction (which is taken
for the direction of vector j).

Fig. 4.2. Current density vector.


With that definition, Eq. (1) may be re-written as 


\frac{d}{dt}\int_V \rho\, d^3r = -\oint_S j_n\, d^2r, (4.3)


where V is an arbitrary stationary volume limited by the closed surface S. Applying to this volume the 
same divergence theorem as was repeatedly used in previous chapters, we get 



\int_V \left(\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j}\right) d^3r = 0. (4.4)


Since volume V is arbitrary, this equation may be true only if

\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0. (4.5)

This is the fundamental continuity equation - which is true even for the time-dependent phenomena. 2 
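
The content of the continuity equation, namely that the charge inside a volume changes only via the current through its border, can be illustrated with a discretized 1D version. This is just a sketch under stated assumptions: the grid, the time step, the (arbitrary) current profile, and the function name `step_density` are all hypothetical choices made for the demonstration.

```python
# Illustrative sketch: a 1D finite-volume form of the continuity equation
# d(rho)/dt + d(j)/dx = 0, cf. Eq. (4.5). The grid, the time step, and the
# current profile are assumed for demonstration only.

def step_density(rho, j_faces, dx, dt):
    """Advance the cell charge densities by one time step.

    rho     -- list of N cell-averaged charge densities
    j_faces -- list of N + 1 current densities at the cell faces
    """
    return [
        r - dt * (j_faces[i + 1] - j_faces[i]) / dx
        for i, r in enumerate(rho)
    ]


n_cells, dx, dt = 50, 0.02, 1.0e-3
rho = [1.0 if 20 <= i < 30 else 0.0 for i in range(n_cells)]

# Assumed current profile: arbitrary inside the region, but zero at both outer
# faces, so that no charge crosses the border of the whole volume.
j_faces = [0.0] + [0.5 * (i % 7) for i in range(1, n_cells)] + [0.0]

total_before = sum(rho) * dx
for _ in range(100):
    rho = step_density(rho, j_faces, dx, dt)
total_after = sum(rho) * dx

print(f"total charge before: {total_before:.12f}")
print(f"total charge after:  {total_after:.12f}")
```

The face currents cancel pairwise in the sum over cells, so the total charge can change only through the two outer faces, which is the discrete counterpart of Eq. (4.3).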


The charge relaxation is of course a dynamic, time-dependent process. However, electric
currents may also exist in stationary situations, when a current source, for example a battery, replenishes
the conductor charges and hence sustains currents at a certain time-independent level - see Fig. 1c. (As
we will see below, in most cases this process requires a persistent replenishment of the electrostatic
energy from either a source or a storage of energy of a different kind - say, the chemical energy of the
battery.) Let us discuss the laws governing the distribution of such dc currents. In this case (∂/∂t = 0),
Eq. (5) reduces to a very simple equation

\nabla\cdot\mathbf{j} = 0. (4.6)

This equation acquires an even simpler form in the particular but important case of electric
circuits (Fig. 3), i.e. systems that may be presented as an electric connection of components of two types:


2 Similar differential relations are valid for the density of any conserved quantity, for example for mass in 
classical fluid dynamics (see, e.g., CM Sec. 8.2), and for the probability in statistical physics (SM Sec. 5.6) and 
quantum mechanics (QM Sec. 1.4). 


(i) small-size (lumped) circuit elements (also called “two-terminal devices”), meaning a passive
resistor, a current source, etc. - generally, any black box with two wires sticking out of it, and

(ii) perfectly conducting wires, with negligible voltage drop along them, that are galvanically
connected at certain points, called nodes (or “junctions”).

Fig. 4.3. Typical system obeying the Kirchhoff laws.


In the standard circuit theory, the electric charges of the nodes are considered negligible, and we
may integrate Eq. (6) over the closed surface drawn around any node to get

\sum_j I_j = 0, (4.7a)

where the summation is over all the wires (numbered with index j) connected in the node. On the other
hand, according to its definition (2.25), the voltage drop V_k across each circuit element may be presented as
the difference of potentials of the adjacent nodes, V_k = φ_k - φ_{k-1}. Summing such differences around any
closed loop of the circuit (Fig. 3), we get all terms cancelled, so that

\sum_k V_k = 0. (4.7b)

These relations are called, respectively, the 1st and 2nd Kirchhoff laws - or sometimes the node
rule and the loop rule. They may seem elementary, and the genuine power of the Kirchhoff approach is
in the fact that a set of Eqs. (7), covering every node and every circuit element of the system, gives a system
of equations sufficient for the calculation of all currents and voltages in it - provided that the relation
between current and voltage is known for each circuit element.

It is almost evident that in the absence of current sources, the system of equations (7) has only a
trivial solution: I_j = 0, V_k = 0 - with the exotic exception of superconductivity, to be discussed in Sec.
6.3. The current sources, which allow non-vanishing current flows, may be described by their
electromotive forces (e.m.f.) 𝒱_k, having the dimensionality of voltage, which have to be taken into
account in the corresponding terms V_k of sum (7b). Let me hope that the reader has some experience of
using Eqs. (7) for the analysis of simple circuits - say consisting of several resistors and dc batteries -
so I may save time on a discussion of these simple problems.
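
As a brief reminder of how Eqs. (7) close into a solvable system, here is a minimal sketch for a hypothetical circuit that is not taken from the notes: a battery with e.m.f. V0 in series with a resistor R1, feeding two resistors R2 and R3 connected in parallel. The component values and the helper `solve_linear` are assumptions made for illustration; each resistor is assumed to obey V = IR.

```python
# Illustrative sketch: Kirchhoff's node and loop rules, Eqs. (4.7), for an
# assumed circuit: battery V0 in series with R1, then R2 and R3 in parallel.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


V0, R1, R2, R3 = 12.0, 2.0, 6.0, 3.0   # assumed values: volts and ohms

# Unknowns: currents I1 (through R1), I2 (through R2), I3 (through R3).
matrix = [
    [1.0, -1.0, -1.0],   # node rule:  I1 - I2 - I3 = 0
    [R1,   R2,   0.0],   # loop rule:  R1*I1 + R2*I2 = V0 (loop with the battery)
    [0.0,  R2,  -R3],    # loop rule:  R2*I2 - R3*I3 = 0  (loop R2-R3)
]
rhs = [0.0, V0, 0.0]

I1, I2, I3 = solve_linear(matrix, rhs)
print(f"I1 = {I1:.3f} A, I2 = {I2:.3f} A, I3 = {I3:.3f} A")
```

The result can be cross-checked by hand: R2 and R3 in parallel give 2 Ω, so the total resistance seen by the battery is 4 Ω.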


4.2. The Ohm law 

As was mentioned above, the relations spelled out in Sec. 1 are sufficient for forming a closed
system of equations for finding currents and electric field in a system only if they are complemented
with material equations relating scalars I and V in each circuit element, i.e. vectors j and E in each
point of the medium of such an element. The simplest, and most frequently met relation of this kind is
the famous Ohm law whose differential form is

\mathbf{j} = \sigma\mathbf{E}, (4.8)

where σ is a constant called conductivity. 3 Though this is not a fundamental relation, and is
approximate for any conducting medium, we can argue that if:

(i) there is no current at E = 0 (mind superconductors!),

(ii) the medium is isotropic or almost isotropic (a notable exception: some organic conductors),

(iii) the mean free path l of current carriers is much smaller than the characteristic scale a of the
spatial variations of j and E,

then the Ohm law may be viewed as a result of the Taylor expansion of the local relation j(E) in
relatively small fields, and thus is very common.

Table 1 gives the experimental values of dc conductivity for some practically important (or just
representative) materials. The reader can see that the range of its values is very broad, covering more
than 30 orders of magnitude, even without going to such extremes as very pure metallic crystals at very
low temperatures, where σ may reach ~10 S/m.


Table 4.1. Ohmic conductivities for some representative (or practically important) materials at 20°C.

Material                               σ (S/m)
Teflon ([C2F4]n)                       10^-22 - 10^-24
Silicon dioxide                        10^-16 - 10^-19
Various glasses                        10^-10 - 10^-14
Deionized water                        ~10^-6
Sea water                              5
Silicon n-doped to 10^16 cm^-3         2.5×10^2
Silicon n-doped to 10^19 cm^-3         1.6×10^4
Silicon p-doped to 10^19 cm^-3         1.1×10^4
Nichrome (alloy 80% Ni + 20% Cr)       0.9×10^6
Aluminum                               3.8×10^7
Copper                                 6.0×10^7
Zinc crystal along a-axis              1.65×10^7
Zinc crystal along c-axis              1.72×10^7


3 In SI units, the conductivity is measured in siemens per meter, where one siemens (S) is the reciprocal of one
ohm: 1 S ≡ (1 Ω)^-1 = 1 A / 1 V. The constant reciprocal to conductivity, 1/σ, is called resistivity, and is commonly
denoted by letter ρ. I will, however, try to avoid using this notion, because I am already overusing this letter.
In order to get some feeling of what these values mean, let us consider a very simple system
(Fig. 4): a plane capacitor of area A ≫ d², filled with a material that has not only a dielectric constant ε_r,
but also some Ohmic conductivity σ, with much more conductive plate electrodes.

Fig. 4.4. “Leaky” plane capacitor.


Assuming that these properties are compatible with each other, 4 we may take that the
distribution of electric potential (not too close to the capacitor edges) still obeys Eq. (2.39), so that the
electric field is vertical and uniform, with E = V/d. Then, according to Eq. (8), the current density is also
uniform, j = σE = σV/d. From here, the total current between the plates is

I = jA = \sigma E A = \sigma\frac{V}{d}A. (4.9)


On the other hand, from Eqs. (2.26) and (3.45), the instant value of the plate charge is Q = C_m V =
(ε_r ε_0 A/d)V. Plugging these relations into Eq. (1), we see that the speed of charge (and voltage)
relaxation does not depend on the geometric parameters A and d:

\frac{dV}{dt} = -\frac{V}{\tau_r}, \qquad \tau_r \equiv \frac{\varepsilon_r\varepsilon_0}{\sigma}, (4.10)


where parameter τ_r has the sense of the relaxation time constant. As we know (see Table 3.1), for most
practical materials the dielectric constant is within one order of magnitude from 10, so that the
numerator of Eq. (10) is of the order of 10^-10 F/m. As a result, according to Table 1, the charge relaxation
time ranges from ~10^14 s (more than a million years!) for the best insulators like Teflon, to ~10^-18 s for the
least resistive metals.
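
These estimates are easy to reproduce. The sketch below uses the order-of-magnitude value of 10^-10 F/m for ε_r ε_0 quoted above, and conductivities from Table 1; the selection of materials, the dictionary layout, and the function name `relaxation_time` are assumptions made for the illustration.

```python
# Illustrative sketch: order-of-magnitude charge relaxation times from
# Eq. (4.10), tau_r = eps_r * eps_0 / sigma.

EPS = 1.0e-10              # F/m, order-of-magnitude estimate of eps_r * eps_0
SECONDS_PER_YEAR = 3.156e7

# Conductivities in S/m, taken from Table 4.1 (for a range, one end is used).
sigma_table = {
    "Teflon (lower end)": 1e-24,
    "Deionized water": 1e-6,
    "Sea water": 5.0,
    "Copper": 6.0e7,
}


def relaxation_time(sigma, eps=EPS):
    """Return tau_r = eps / sigma, in seconds."""
    return eps / sigma


tau = {name: relaxation_time(s) for name, s in sigma_table.items()}
for name, t in tau.items():
    print(f"{name:20s} tau_r ~ {t:.1e} s = {t / SECONDS_PER_YEAR:.1e} years")
```

Since the same value of ε_r ε_0 is used for all materials, the output should be read only as an order-of-magnitude guide.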


What is the physics behind these values of σ and why, for some materials, does Table 1 give them
with such a large uncertainty? If charge carriers move as classical particles (e.g., in plasmas or
non-degenerate semiconductors), a reasonable description of conductivity is given by the famous Drude
formula. 5 In his picture, due to a weak electric field, the charge carriers are accelerated in its direction
(possibly on top of their random motion in all directions, i.e. with a vanishing average velocity
vector):

\frac{d\mathbf{v}}{dt} = \frac{q}{m}\mathbf{E}, (4.11)

and as a result their velocity acquires the average value

\langle\mathbf{v}\rangle = \frac{d\mathbf{v}}{dt}\,\tau = \frac{q}{m}\mathbf{E}\,\tau, (4.12)


4 As will be discussed in Chapter 6, such simple analysis is only valid if σ is not too high.

5 It was suggested by P. Drude in 1900. 
where the phenomenological parameter τ = l/v (not to be confused with τ_r!) may be understood as the
effective average time between carrier scattering events. From here, the current density is

Two versions of the Drude formula
\mathbf{j} = qn\langle\mathbf{v}\rangle = \frac{q^2 n\tau}{m}\mathbf{E}, \quad \text{i.e.} \quad \sigma = \frac{q^2 n\tau}{m}. (4.13a)

(Notice that σ does not depend on the sign of the carrier charge.) Another form of the same result, more popular
in the physics of semiconductors, is

\sigma = qn\mu, \quad \text{with} \quad \mu \equiv \frac{q\tau}{m}, (4.13b)

where parameter μ, defined by the relation ⟨v⟩ = μE, is called the charge carrier mobility.
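
To get a feeling for the numbers, Eq. (13a) may be inverted to estimate the effective scattering time from a measured conductivity. The sketch below does this for copper, using σ from Table 1; the carrier density of about 8.5×10^28 m^-3 (roughly one conduction electron per atom) and the use of the free-electron mass are assumptions made for this illustration, and the function names are hypothetical.

```python
# Illustrative sketch: invert the Drude formula, Eq. (4.13a), to estimate the
# effective scattering time tau = sigma * m / (n * q^2), and then the mobility
# mu = q * tau / m of Eq. (4.13b).

E_CHARGE = 1.602e-19     # elementary charge, C
E_MASS = 9.109e-31       # free-electron mass, kg (assumed effective mass)

sigma_cu = 6.0e7         # S/m, copper, from Table 4.1
n_cu = 8.5e28            # m^-3, assumed conduction electron density of copper


def drude_scattering_time(sigma, n, q=E_CHARGE, m=E_MASS):
    """Return tau (in seconds) from sigma = q^2 n tau / m."""
    return sigma * m / (n * q**2)


def mobility(tau, q=E_CHARGE, m=E_MASS):
    """Return the mobility mu = q tau / m, in m^2/(V s)."""
    return q * tau / m


tau_cu = drude_scattering_time(sigma_cu, n_cu)
mu_cu = mobility(tau_cu)
print(f"tau ~ {tau_cu:.2e} s, mu ~ {mu_cu:.2e} m^2/(V s)")
```

With these inputs the scattering time comes out at a few times 10^-14 s; the two forms of the Drude formula are consistent by construction, since q n μ returns the initial σ.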


Most good conductors (e.g., metals) are essentially degenerate Fermi gases (or liquids), in which
the average thermal energy of a particle, k_B T, is much lower than the Fermi energy ε_F. In this case, a
quantum theory is needed for the calculation of σ. Such a theory was developed by the quantum physics’
godfather A. Sommerfeld in 1927 (and is sometimes called the Drude-Sommerfeld model). I have no
time to discuss it in this course, 6 and here I will only notice that for an ideal, isotropic Fermi gas the
result is reduced to Eq. (13), with a certain effective value of τ, so it may be used for estimates of σ,
with due respect to the quantum theory of scattering. In a typical metal, n is very high (~10^23 cm^-3) and
is fixed by the atomic structure, so that the sample quality may only affect σ via the scattering time τ.


At room temperature, the scattering of electrons by thermally-excited lattice vibrations
(phonons) dominates, so that τ and σ are high but finite, and do not change much from one sample to
another. (Hence, the more accurate values given for metals in Table 1.) On the other hand, at T → 0, a
perfect crystal should not exhibit scattering at all, and conductivity should be infinite. In practice, this is
never true (for example, due to electron scattering from imperfect boundaries of finite-size samples),
and the effective conductivity σ is infinite (or practically infinite, at least above the measurable value
~10“ S/m) only in superconductors. 7


On the other hand, the conductivity of quasi-insulators (including deionized water) and
semiconductors depends mostly on the carrier density n that is much lower than in metals. From the
point of view of quantum mechanics, this happens because the ground states of charge
carriers are localized within an atom (or molecule), and separated from excited states, with
space-extended wavefunctions, by a large energy gap (called bandgap). For example, in SiO2 the bandgap
approaches 9 eV, equivalent to ~100,000 K. This is why, even at room temperature, the density of
thermally-excited free charge carriers in good insulators is negligible. In these materials, n is determined
by impurities and vacancies, and may depend on a particular chemical synthesis or other fabrication
technology, rather than on fundamental properties of the material. (On the contrary, the carrier mobility
μ in these materials is almost technology-independent.)


The practical importance of the technology may be illustrated by the following example. In cells
of the so-called floating-gate memories, in particular the flash memories, which currently dominate the
nonvolatile digital memory technology, data bits are stored as small electric charges (Q ~ 10^-16 C) of


6 For such a discussion see, e.g., SM Sec. 6.3. 

7 Electrodynamic properties of superconductors are so interesting (and important) that I will discuss them in more 
detail in Chapter 6. 
highly doped silicon islands (so-called floating gates) separated from the rest of the integrated circuit
with a ~10-nm-thick layer of silicon dioxide, SiO2. Such layers are fabricated by high-temperature
oxidation of virtually perfect silicon crystals. The conductivity of the resulting high-quality (though
amorphous) material is so low, σ ~ 10^-19 S/m, that the relaxation time τ_r, defined by Eq. (10), is well
above 10 years - the industrial standard for data retention in non-volatile memories. In order to
appreciate how good this technology is, the cited value should be compared with the typical
conductivity σ ~ 10^-16 S/m of the usual, bulk SiO2 ceramics. 8
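
The comparison can be spelled out with Eq. (10). The following is only an order-of-magnitude sketch: it assumes the same rough value of 10^-10 F/m for the product ε_r ε_0 that was used in Sec. 4.2, and the label "bulk" on the second relaxation time is notation introduced here for the illustration.

```latex
\tau_r \sim \frac{10^{-10}\ \mathrm{F/m}}{10^{-19}\ \mathrm{S/m}} = 10^{9}\ \mathrm{s} \approx 30\ \text{years},
\qquad
\tau_r^{\rm bulk} \sim \frac{10^{-10}\ \mathrm{F/m}}{10^{-16}\ \mathrm{S/m}} = 10^{6}\ \mathrm{s} \approx 12\ \text{days}.
```

So, with these assumed inputs, the three orders of magnitude in conductivity separate retention over years from retention over days.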


4.3. Boundary problems 

For an Ohmic conducting medium, we may combine Eqs. (6) and (8) to get the following differential
equation

\nabla\cdot(\sigma\nabla\phi) = 0. (4.14)


For a uniform conductor (σ = const), Eq. (14) is reduced to the Laplace equation for the electrostatic
potential φ. As we already know from Chapters 2 and 3, its solution depends on the boundary
conditions. These conditions depend on the interface type.


(i) Conductor-conductor interface. Applying the continuity equation (6) to a Gauss-type pillbox
at the interface of two different conductors (Fig. 5), we get

(j_n)_1 = (j_n)_2, (4.15)

so that if the Ohm law is valid inside each medium, then

\sigma_1\frac{\partial\phi_1}{\partial n} = \sigma_2\frac{\partial\phi_2}{\partial n}. (4.16)



Fig. 4.5. DC current “refraction” at the interface between two 
different conductors. 


Also, since the electric field should be finite, its potential φ has to be continuous across the
interface - the condition that may also be written as

\frac{\partial\phi_1}{\partial\tau} = \frac{\partial\phi_2}{\partial\tau}, (4.17)

where τ here denotes a direction tangential to the interface (not the scattering time of Sec. 2).

8 Unfortunately, these notes are not an appropriate platform to discuss details of the floating-gate memory
technology. However, I think that every educated physicist should know its basics, because such memories are
presently the driver of all semiconductor integrated circuit technology development, and hence of the whole
information technology progress. Perhaps the best available book is J. Brewer and M. Gill (eds.), Nonvolatile
Memory Technologies with Emphasis on Flash, IEEE, 2008.


Both these conditions (and hence the solutions of the boundary problems using them) are similar to 
those for the interface between two dielectrics - cf. Eqs. (3.46)-(3.47). 


Note that using the Ohm law, Eq. (17) may be rewritten as

\frac{1}{\sigma_1}(j_\tau)_1 = \frac{1}{\sigma_2}(j_\tau)_2. (4.18)

Comparing it with Eq. (15) we see that, generally, the current density magnitude changes at the
interface: j_1 ≠ j_2. It is also curious that if σ_1 ≠ σ_2, the current line slope changes at the interface (Fig. 5),
qualitatively similar to the refraction of light rays in optics - see Chapter 7.
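
This “refraction” can be made quantitative by combining Eqs. (15) and (18). In the sketch below, θ_1 and θ_2 are the angles between the current lines and the normal to the interface in the two media, so that tan θ = j_τ/j_n; this angle notation is an assumption introduced here, not taken from the notes.

```latex
\frac{\tan\theta_1}{\tan\theta_2}
  = \frac{(j_\tau)_1/(j_n)_1}{(j_\tau)_2/(j_n)_2}
  = \frac{(j_\tau)_1}{(j_\tau)_2}
  = \frac{\sigma_1}{\sigma_2}.
```

Here the second step uses Eq. (15) to cancel the normal components, and the last step uses Eq. (18). In this form, the current lines deviate further from the normal in the medium with the higher conductivity.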

(ii) Conductor-electrode interface. The definition of an electrode, or a “perfect conductor”, is a
medium with σ → ∞. Then, at a fixed current density at the interface, the electric field in the electrode
tends to zero, and hence it may be described by equation

\phi = \phi_j = {\rm const}, (4.19)

where constants φ_j may be different for different electrodes (numbered with index j). Note that with
such boundary conditions the Laplace boundary problem becomes exactly the same as in electrostatics -
see Eq. (2.35) - and hence we can use all the methods (and some solutions :-) of Chapter 2 for finding
the dc current distribution.


(iii) Conductor-insulator interface. For the description of an insulator, we can use σ = 0, so that
Eq. (16) yields the following boundary condition,

\frac{\partial\phi}{\partial n} = 0, (4.20)

for the potential derivative inside the conductor. From the Ohm law we see that this is just the very 
natural requirement for the dc current not to flow into an insulator. 


Now, note that this condition makes the Laplace problem inside the conductor completely well-defined,
and independent of the potential distribution in the adjacent insulator. On the contrary, due to
the continuity of the electrostatic potential at the border, its distribution in the insulator has to follow 
that inside the conductor. Let us discuss this conceptual issue on the following (apparently, trivial) 
example: dc current in a long wire with a constant cross-section area A. The reader certainly knows the 
answer: 



where R = — 
/ 


<jA ’ 


(4.21) 


where $l$ is the wire length, and the constant $R$ is called the resistance.⁹ However, let us get this result
formally from our theoretical framework. For the ideal geometry shown in Fig. 6a, this is easy to do.
Here the potential evidently has a linear 1D distribution


9 The first of Eqs. (21) is essentially the integral form of the Ohm law (8), and is valid not only for a uniform 
wire, but for any Ohmic conductor with a geometry in which $I$ and $V$ may be clearly defined.


$$ \phi = \text{const} - \frac{V}{l}x, \tag{4.22} $$

both in the conductor and the surrounding free space, with both boundary conditions (16) and (17) 
satisfied at the conductor-insulator interfaces, and condition (20) satisfied at the conductor-electrode 
interfaces. As a result, the electric field is constant and has only one component $E_x = V/l$, so that inside
the conductor

$$ j_x = \sigma E_x, \qquad I = j_x A, \tag{4.23} $$

giving us the well-known Eq. (21).

Fig. 4.6. (a) Trivial and (b) not-so-trivial problems of the field distribution at dc current flow, with the
electrodes kept at potentials $\phi = V$ and $\phi = 0$. (For the latter case, schematically.)


However, what about the geometry shown in Fig. 6b? In this case the field distribution in the 
insulator is dramatically different, but according to the boundary problem defined by Eqs. (14) and (20),
inside the conductor the solution is exactly the same as it was in the former case. Now, the Laplace 
equation in the surrounding insulator has to be solved with the boundary values of the electrostatic 
potential, “dictated” by the distribution of the current (and hence potential) in the conductor. 

Let us solve a problem in which this conduction hierarchy may be followed analytically to the very
end. Consider an empty spherical cavity cut in a conductor with an initially uniform current flow with
constant density $\mathbf{j}_0 = \mathbf{n}_z j_0$ (Fig. 7a). Following the conduction hierarchy, we have to solve the boundary
problem in the conducting part of the system, i.e. outside the sphere ($r \geq R$), first. Since the problem is
evidently axially-symmetric, we already know the general solution of the Laplace equation - see Eq.
(2.172). Moreover, we know that in order to match the uniform field at $r \to \infty$, all coefficients $a_l$ but
one ($a_1 = -E_0 = -j_0/\sigma$) have to be zero, and that the boundary conditions at $r = R$ will give zero solutions
for all coefficients $b_l$ but one ($b_1$), so that

$$ \phi = -\frac{j_0}{\sigma} r\cos\theta + \frac{b_1}{r^2}\cos\theta, \quad \text{for } r \geq R. \tag{4.24} $$

In order to find coefficient $b_1$, we have to use the boundary condition (20) at $r = R$:

$$ \left.\frac{\partial \phi}{\partial r}\right|_{r=R} = \left(-\frac{j_0}{\sigma} - \frac{2b_1}{R^3}\right)\cos\theta = 0. \tag{4.25} $$

This gives $b_1 = -j_0 R^3/2\sigma$, so that, finally,

$$ \phi = -\frac{j_0}{\sigma}\left(r + \frac{R^3}{2r^2}\right)\cos\theta. \tag{4.26} $$


(Note that this potential distribution corresponds to the dipole moment $p = -E_0 R^3/2$. It is easy to check
that if the empty sphere was cut in a dielectric, the potential distribution outside the cavity would be
similar, with $p = -E_0 R^3(\varepsilon_r - 1)/(2\varepsilon_r + 1)$. In the limit $\varepsilon_r \to \infty$, these two results coincide, despite the rather
different type of the problem: in the dielectric case, there is no current at all.)



Fig. 4.7. Spherical cavity in a uniform conductor: (a) the problem’s geometry, and (b) the equipotential
surfaces, as given by Eq. (26) outside the cavity and Eq. (28) inside it.


Now, as the second step in the conductivity hierarchy, we may find the electrostatic potential
distribution $\phi(r, \theta)$ in the insulator, in this particular case inside the cavity ($r \leq R$). It should also satisfy
the Laplace equation with the boundary conditions at $r = R$, “dictated” by distribution (26):

$$ \phi(R, \theta) = -\frac{3j_0}{2\sigma} R\cos\theta. \tag{4.27} $$

We could again solve this problem by the formal variable separation (keeping in the general solution
(2.172) only the term proportional to $a_1$, which does not diverge at $r \to 0$), but if we notice that boundary
condition (27) depends on just one Cartesian coordinate, $z = R\cos\theta$, the solution may be just guessed:

$$ \phi(r, \theta) = -\frac{3j_0}{2\sigma} z = -\frac{3j_0}{2\sigma} r\cos\theta, \quad \text{at } r \leq R. \tag{4.28} $$

It evidently satisfies the Laplace equation and the boundary condition (27), and corresponds to a
constant vertical electric field equal to $3j_0/2\sigma$ - see Fig. 7b.
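
The cavity solution is easy to sanity-check numerically. The sketch below is only an illustration under stated assumptions: the unit values of $\sigma$, $j_0$ and $R$, and all function names, are hypothetical choices, not part of the notes. It verifies that Eq. (26) satisfies the boundary condition (20), that Eqs. (26) and (28) match at $r = R$, and that the field inside the cavity equals $3j_0/2\sigma$.

```python
import numpy as np

# Illustrative values (assumptions, not from the text), in SI units.
sigma = 1.0   # conductivity, S/m
j0 = 1.0      # current density far from the cavity, A/m^2
R = 1.0       # cavity radius, m

def phi_out(r, theta):
    """Potential in the conductor, Eq. (4.26); physically meaningful for r >= R."""
    return -(j0 / sigma) * (r + R**3 / (2 * r**2)) * np.cos(theta)

def phi_in(r, theta):
    """Potential inside the cavity, Eq. (4.28); physically meaningful for r <= R."""
    return -(3 * j0 / (2 * sigma)) * r * np.cos(theta)

theta = np.linspace(0.0, np.pi, 181)
h = 1e-6

# Boundary condition (4.20): the radial derivative of Eq. (4.26) vanishes at r = R.
# The analytic expression is simply evaluated on both sides of r = R (central difference).
dphi_dr = (phi_out(R + h, theta) - phi_out(R - h, theta)) / (2 * h)
max_radial_current = sigma * np.max(np.abs(dphi_dr))

# Continuity of the potential across the cavity surface, Eq. (4.27).
max_mismatch = np.max(np.abs(phi_out(R, theta) - phi_in(R, theta)))

# On the axis theta = 0 we have z = r, so E_z = -d(phi)/dr there.
E_z_inside = -(phi_in(0.5 * R + h, 0.0) - phi_in(0.5 * R, 0.0)) / h

print("max |j_r| at r = R:", max_radial_current)
print("max potential mismatch at r = R:", max_mismatch)
print("E_z inside the cavity:", E_z_inside, "expected:", 1.5 * j0 / sigma)
```

The first two printed numbers should be at the level of the finite-difference and rounding errors, while the third one reproduces the factor 3/2 by which the field in the cavity exceeds the field $j_0/\sigma$ far from it.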

The conductivity hierarchy says that static electric fields and charges outside conductors (e.g.,
electric wires) do not affect currents flowing in the wires, and it is physically clear why. For example, if
a charge in vacuum is slowly moved close to a wire, it (in accordance with the linear superposition
principle) will only induce an additional surface charge (see Chapter 2) that screens the external
charge’s field, without participating in (or disturbing) the current flow inside the conductor.

Besides the conceptual discussion, the two examples given above may be considered as a
demonstration of the application of the first two methods described in Chapter 2 (the orthogonal
coordinates (Fig. 5) and variable separation (Fig. 6)) to dc current distribution problems. Continuing this
review of the methods we know, let us discuss the analog of the method of charge images. Let us
consider the spherically-symmetric distribution of the electrostatic potential, similar to that
given by Eq. (1.35):

$$ \phi = \frac{c}{r}. \tag{4.29} $$

As we know from Chapter 1, this is a particular solution of the 3D Laplace equation at all points but $r = 0$,
and hence is a legitimate solution in a current-carrying conductor as well. In vacuum, this distribution
would correspond to a point charge $q = 4\pi\varepsilon_0 c$; but what about the conductor? Calculating the
corresponding electric field and current density,

$$ \mathbf{E} = -\nabla\phi = \frac{c}{r^3}\mathbf{r}, \qquad \mathbf{j} = \sigma\mathbf{E} = \sigma\frac{c}{r^3}\mathbf{r}, \tag{4.30} $$

we see that the total current flowing from the point in the origin through a sphere of an arbitrary radius $r$
does not depend on the radius:

$$ I = Aj = 4\pi r^2 j = 4\pi\sigma c. \tag{4.31} $$

Plugging the resulting $c$ into Eq. (29), we get

$$ \phi = \frac{I}{4\pi\sigma r}. \tag{4.32} $$

Hence the Coulomb-type distribution of the electric potential in a conductor is possible (at least
at some distance from the singular point $r = 0$), and describes dc current $I$ flowing out of a small-size
electrode - or into such a point, if coefficient $c$ is negative. Such current injection may be readily
implemented experimentally; think for example about an insulated wire with a small bare end, inserted
into a poorly conducting soil - an important method in geophysical research.¹⁰

Now let the injection point $\mathbf{r}'$ be close to a plane interface between the conductor and an
insulator (Fig. 8). In this case, besides the Laplace equation, we should satisfy the boundary condition,

$$ j_n = \sigma E_n = -\sigma\frac{\partial\phi}{\partial n} = 0. \tag{4.33} $$

It is clear that this can be done by replacing the insulator with a conductor with an additional
current injection point, at the mirror image point $\mathbf{r}''$. Note, however, that in contrast to the charge
images, the sign of the imaginary current has to be the same as, not opposite to, that of the initial one, so that the
total electrostatic potential inside the conducting semi-space is


10 Such situations are even more natural in 2D, for example, think about a wire soldered, in a small spot,
to a thin metallic foil. (Note that here the current density distribution law is different, $j \propto 1/r$ rather than $1/r^2$.)

$$ \phi(\mathbf{r}) = \frac{I}{4\pi\sigma}\left(\frac{1}{|\mathbf{r}-\mathbf{r}'|} + \frac{1}{|\mathbf{r}-\mathbf{r}''|}\right). \tag{4.34} $$


(Note that the image current’s sign would be opposite if we discussed an interface between a conductor 
with a moderate conductivity and a perfect conductor (“electrode”) whose potential should be virtually 
constant.) 



This result may be readily used, for example, to calculate the current density at the conductor’s 
surface, as a function of distance $\rho$ from point 0 (the surface point closest to the current injection) - see
Fig. 8. At the surface, Eq. (34) yields 


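$$ \phi = \frac{I}{2\pi\sigma}\frac{1}{(\rho^2 + d^2)^{1/2}}, \tag{4.35} $$

so that the current density is independent of $\sigma$:

$$ j_\rho = \sigma E_\rho = -\sigma\frac{\partial\phi}{\partial\rho} = \frac{I}{2\pi}\frac{\rho}{(\rho^2 + d^2)^{3/2}}. \tag{4.36} $$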


Deviations from Eqs. (35) and (36), which are valid for a uniform medium, may be used to find
and characterize conductance inhomogeneities, say, those due to mineral deposits in the Earth’s crust.¹¹
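
The current-image construction may be checked with a few lines of code. The following is a minimal sketch, assuming the conductor fills the half-space $z > 0$ and the injection point sits at depth $d$ on the $z$ axis; the numerical values and all names are illustrative assumptions. It confirms that potential (34) gives no normal current at the surface, and that the surface current density follows Eq. (36).

```python
import numpy as np

# Illustrative values (assumptions): conductor at z > 0, insulator at z < 0.
sigma = 0.01   # conductivity, S/m
I = 1.0        # injected current, A
d = 2.0        # depth of the injection point below the surface, m

def phi(x, y, z):
    """Eq. (4.34): real source at depth d plus a same-sign image at height d above the surface."""
    r1 = np.sqrt(x**2 + y**2 + (z - d)**2)
    r2 = np.sqrt(x**2 + y**2 + (z + d)**2)
    return I / (4 * np.pi * sigma) * (1 / r1 + 1 / r2)

rho = np.linspace(0.1, 10.0, 50)   # distance from the surface point closest to the injection
h = 1e-5

# Normal current at the surface z = 0, Eq. (4.33): it should vanish by symmetry.
j_n = -sigma * (phi(rho, 0.0, h) - phi(rho, 0.0, -h)) / (2 * h)

# Tangential current at the surface, compared with Eq. (4.36).
j_rho_numeric = -sigma * (phi(rho + h, 0.0, 0.0) - phi(rho - h, 0.0, 0.0)) / (2 * h)
j_rho_formula = I / (2 * np.pi) * rho / (rho**2 + d**2)**1.5

print("max |j_n| at the surface:", np.max(np.abs(j_n)))
print("max deviation from Eq. (4.36):", np.max(np.abs(j_rho_numeric - j_rho_formula)))
```

Changing `sigma` rescales the potential but leaves `j_rho_numeric` the same, in agreement with the remark that the surface current density is independent of $\sigma$.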


4.4. Dissipation power 

Let me conclude this brief chapter with an ultra-short discussion of energy dissipation in
conductors. In contrast to the electrostatics situations in insulators (vacuum or dielectrics), at dc
conduction the electrostatic energy $U$ is “dissipated” (i.e. transferred to heat) at a certain rate
$\mathscr{P} = -dU/dt$, called dissipation power.¹² This rate may be evaluated by calculating the power of the electric field’s
work on a single moving charge:


11 In practice, the current injection may be produced, due to electrochemical reactions, by an ore mass itself, so
that one need only measure (and interpret :-) the resulting potential distribution - the so-called self-potential
method - see, e.g., Sec. 6.1 in the monograph by W. Telford et al., Applied Geophysics, 2nd ed., Cambridge U. Press,
1990.

12 Since the electric field and hence the electrostatic energy are time-independent, this means that the energy is 
replenished at the same rate from the current source(s). 




$$ \mathscr{P}_1 = \mathbf{F}\cdot\mathbf{v} = q\mathbf{E}\cdot\mathbf{v}. \tag{4.37} $$

After the summation over all charges, Eq. (37) gives us the dissipation power. If the density $n$
of the charged particles is uniform, multiplying both parts of this equation by it, and taking into account that $qn\mathbf{v} = \mathbf{j}$,
for the power dissipated in a unit volume we get the Joule law

$$ \frac{\mathscr{P}}{V} = \frac{\mathscr{P}_1 N}{V} = \mathscr{P}_1 n = q\mathbf{E}\cdot\mathbf{v}n = \mathbf{E}\cdot\mathbf{j}. \tag{4.38} $$


In the particular case of the Ohmic conductivity, this expression may be also rewritten in two
other forms:

$$ \frac{\mathscr{P}}{V} = \sigma E^2 = \frac{j^2}{\sigma}. \tag{4.39} $$


At dc conduction, the energy is permanently replenished by a flow of power from the current source(s). 


With our electrostatics background, it is straightforward (and hence left for the reader’s exercise) to
prove that the dc current distribution in a uniform Ohmic conductor, at a fixed voltage applied at its
borders, corresponds to the minimum of the total dissipation power

$$ \mathscr{P} = \sigma\int_V E^2 d^3r. \tag{4.40} $$
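
A numerical illustration (not a proof) of this minimum property may be useful. The sketch below assumes the simplest geometry, the uniform wire of Fig. 6a with a potential depending only on $x$; the parameter values, the grid, and the random perturbations are all illustrative assumptions. It compares the discretized Eq. (40) for the linear distribution (22) with that for trial distributions having the same potentials at the wire ends.

```python
import numpy as np

# Illustrative values (assumptions): a uniform wire with fixed end potentials.
sigma, A, l, V = 2.0, 0.5, 3.0, 1.5   # S/m, m^2, m, V
N = 200                                # number of grid intervals along the wire
x = np.linspace(0.0, l, N + 1)
dx = l / N

def dissipation(phi):
    """Discretized Eq. (4.40) for a potential depending on x only: sigma * A * sum(E^2 * dx)."""
    E = -np.diff(phi) / dx
    return sigma * A * np.sum(E**2) * dx

phi_linear = V * (1 - x / l)           # Eq. (4.22) with phi(0) = V and phi(l) = 0
P_linear = dissipation(phi_linear)
P_expected = V**2 * sigma * A / l      # V^2 / R, with R = l / (sigma * A) from Eq. (4.21)

# Trial potentials: random perturbations that keep the end values (the applied voltage) fixed.
rng = np.random.default_rng(0)
P_perturbed = []
for _ in range(20):
    bump = 0.05 * V * rng.standard_normal(N + 1)
    bump[0] = bump[-1] = 0.0
    P_perturbed.append(dissipation(phi_linear + bump))

print("P for the linear potential:", P_linear, " V^2/R:", P_expected)
print("smallest P among the perturbed trials:", min(P_perturbed))
```

Every perturbed trial gives a larger value than the linear distribution, because the cross term between the uniform field and the perturbation sums to zero when the end potentials are fixed, leaving only a positive quadratic contribution.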


4.5. Exercise problems

4.1. Find the resistance between two large conductors separated
with a very thin, plane, insulating partition, with a circular hole of
radius $R$ in it - see Fig. on the right.

Hint: You may like to use the degenerate ellipsoidal coordinates
that had been used in Sec. 2.4 to find the self-capacitance of a round
disk in vacuum.


4.2. Calculate the effective (average) conductivity $\sigma_{ef}$ of a
medium with many empty spherical cavities of radius $R$, carved at
random points in a uniform Ohmic conductor (see Fig. on the right), in
the limit of low density $n \ll R^{-3}$ of the spheres.

Hint: Try to use the analogy with a dipole medium (Sec. 3.2).


4.3. In two separate experiments, a narrow gap, of irregular width, between two close metallic
electrodes is filled with some material - in the first case, with a uniform linear insulator with an electric
permittivity $\varepsilon$, and in the second case, with a uniform conducting material with an Ohmic conductivity
$\sigma$. Neglecting the fringe effects, calculate the relation between the mutual capacitance $C$ between the
electrodes (in the first case) and the dc resistance $R$ between them (in the second case).

4.4. Calculate the voltage drop $V$ across a uniform, wide
resistive slab of thickness $t$, at distance $l$ from the points of
injection/ejection of dc current $I$ that is passed across the slab - see
Fig. on the right.

Hint: Try to use the dc current analog of the charge image
method.


4.5. Find the voltage drop $V$ between two corners of a square
cut from a uniform, resistive sheet of thickness $t$, induced by dc
current $I$ that is passed between its two other corners - see Fig. on the
right.



4.6. Calculate the distribution of dc current density in a thin, round, uniform resistive disk, if the
current is inserted into a point at its rim, and picked up at the center.


4.7.* The simplest model of a vacuum diode consists of two plane, parallel metallic electrodes of
area $A$, separated by a gap of thickness $d \ll A^{1/2}$: a “cathode” which emits electrons to vacuum, and an
“anode” which absorbs the electrons arriving at its surface. Calculate the dc I-V curve of the diode, i.e.
the stationary relation between current $I$ flowing between the electrodes and voltage $V$ applied between
them, using the following simplifying assumptions:

(i) due to the effect of the negative space charge of the emitted electrons, current / is much 
smaller than the emission ability of the cathode, 

(ii) the initial velocity of the emitted electrons is negligible, and 

(iii) the direct Coulomb interaction of electrons (besides the space charge effect) is negligible. 


4.8.* Calculate the space-charge-limited current in a system with the same geometry, and using
the same assumptions as in the previous problem, besides assuming now that the emitted charge carriers
move not ballistically, but in accordance with the Ohm law, with the conductivity given by Eq. (4.13):
$\sigma = q^2\mu n$, with constant mobility $\mu$.

Hint: In order to get a realistic result, assume that the medium in which the carriers move¹³ has a
certain dielectric constant $\varepsilon_r$.


4.9. Prove that the distribution of dc currents in a uniform Ohmic conductor, at fixed voltage
applied at its boundaries, corresponds to the minimum of the total power dissipation (“Joule heat”).


13 As was mentioned in Sec. 4.2 of the lecture notes, the assumption of constant (charge-density-independent)
mobility is most suitable for semiconductors.

Chapter 5. Magnetism

Despite the fact that we are now starting to discuss a completely new type of electromagnetic 
interactions, its coverage (for the stationary case) will take just one chapter, because we will be able to 
recycle many ideas and methods of electrostatics, though with a twist or two. 


5.1. Magnetic interaction of currents 

DC currents in conductors usually leave them electroneutral, $\rho(\mathbf{r}) = 0$, with a very good
precision, because any virtual misbalance of positive and negative charge density results in extremely
strong Coulomb forces that restore their balance by an additional shift of free carriers.¹ This is why let
us start the discussion of magnetic interactions from the simplest case of two spatially-separated,
current-carrying, electroneutral conductors (Fig. 1).



Fig. 5.1. Magnetic interaction of two currents.

According to the Coulomb law, there should be no force between them. However, several
experiments carried out in the early 1820s² proved that such non-Coulomb forces do exist, and are the
manifestation of another, magnetic interaction between the currents. In the contemporary notation used in this
course, their results may be summarized with one formula, in SI units expressed as:³

$$ \mathbf{F} = -\frac{\mu_0}{4\pi}\int_V d^3r\int_{V'} d^3r'\,[\mathbf{j}(\mathbf{r})\cdot\mathbf{j}'(\mathbf{r}')]\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}. \tag{5.1} $$


Here coefficient $\mu_0/4\pi$ (where $\mu_0$ is called either the magnetic constant or the free space permeability),
by definition, equals exactly $10^{-7}$ SI units, thus relating the electric current (and hence electric charge)
definition to that of force - see below.


Note that the Coulomb law (1.1), with the account of the linear superposition principle, may be 
presented in a very similar form: 


1 The most important case when the electroneutrality does not hold is the motion of electrons in vacuum. In this 
case, magnetic forces coexist with (typically, stronger) electrostatic forces - see Eq. (3) below and its discussion. 
In some semiconductor devices, local violations of electroneutrality also play an important role. 

2 Most notably, by H. C. Ørsted, J.-B. Biot and F. Savart, and A.-M. Ampère.

3 In the Gaussian units, coefficient $\mu_0/4\pi$ is replaced with $1/c^2$ (i.e., implicitly with $\mu_0\varepsilon_0$) where $c$ is the speed of
light, in modern metrology considered exactly known - see, e.g., appendix CA: Selected Physical Constants.

$$ \mathbf{F} = \frac{1}{4\pi\varepsilon_0}\int_V d^3r\int_{V'} d^3r'\,\rho(\mathbf{r})\rho'(\mathbf{r}')\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}. \tag{5.2} $$


Besides the different coefficient and sign, the “only” difference of Eq. (1) from Eq. (2) is the scalar 
product of current densities, evidently necessary because of the vector character of the current density. 
We will see that this difference will bring certain complications in applying the electrostatics 
approaches, discussed in the previous chapters, to magnetostatics. 

Before going to their discussion, let us have one more glance at the coefficients in Eqs. (1) and
(2). To compare them, let us consider two objects with uncompensated charge distributions $\rho(\mathbf{r})$ and
$\rho'(\mathbf{r})$, moving parallel to each other, each as a whole, with certain velocities $\mathbf{v}$ and $\mathbf{v}'$, as measured in an
inertial “lab” frame. In this case, $\mathbf{j}(\mathbf{r}) = \rho(\mathbf{r})\mathbf{v}$, $\mathbf{j}(\mathbf{r})\cdot\mathbf{j}'(\mathbf{r}') = \rho(\mathbf{r})\rho'(\mathbf{r}')vv'$, and the integrals in Eqs. (1) and
(2) become functionally similar, and differ only by the factor

$$ \frac{F_\text{magnetic}}{F_\text{electric}} = -\frac{\mu_0 vv'/4\pi}{1/4\pi\varepsilon_0} = -\frac{vv'}{c^2}. \tag{5.3} $$


(This expression holds in any consistent system of units.) We immediately see that magnetism is an
essentially relativistic phenomenon, very weak in comparison with the electrostatic interaction at the
human scale velocities, $v \ll c$, and may dominate only if the latter interaction vanishes - as it does in
electroneutral systems.⁴

Also, Eq. (3) points at an interesting paradox. Consider two electron beams moving parallel to
each other, with the same velocity $v$ with respect to a lab reference frame. Then, according to Eq. (3),
the net force of their total (electric and magnetic) interaction is proportional to $(1 - v^2/c^2)$, and tends to
zero in the limit $v \to c$. However, in the reference frame moving together with electrons, they are not
moving at all, i.e. $v = 0$. Hence, from the point of view of such a moving observer, the electron beams
should interact only electrostatically, with a repulsive force independent of velocity $v$. Historically, this
had been one of several paradoxes that led to the development of special relativity; its resolution will
be discussed in Chapter 9, devoted to this theory.

Returning to Eq. (1), in some simple cases, the double integration in it may be carried out
analytically. First of all, let us simplify this expression for the case of two thin, long conductors (wires)
separated by a distance much larger than their thickness. In this case we may integrate the products $\mathbf{j}\,d^3r$
and $\mathbf{j}'d^3r'$ over the wires’ cross-sections first, neglecting the corresponding change of $(\mathbf{r} - \mathbf{r}')$. Since the
integrals of the current density over the cross-sections of the wires are just the currents $I$ and $I'$ in the
wires, and cannot change along their lengths (correspondingly, $l$ and $l'$), they may be taken out of the
remaining integrals, reducing Eq. (1) to

$$ \mathbf{F} = -\frac{\mu_0 II'}{4\pi}\oint_l\oint_{l'}(d\mathbf{r}\cdot d\mathbf{r}')\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}. \tag{5.4} $$


4 The discovery and initial studies of such a subtle, relativistic phenomenon as magnetism in the early 19th 
century was much facilitated by the relative abundance of natural ferromagnets, materials with spontaneous 
magnetic polarization, whose strong magnetic field may be traced back to relativistic effects (such as spin) in 
atoms. (The electrostatic analogs of such materials, electrets, are much more rare.) I will briefly discuss the 
ferromagnetism in Sec. 5 below.

As the simplest example, consider two straight, parallel wires (Fig. 2), separated by distance $d$,
with length $l \gg d$. In this case, due to symmetry, the vector of magnetic interaction force has to:

(i) lie in the same plane as the currents, and

(ii) be perpendicular to the wires - see Fig. 2.

Hence we can limit our calculations to just one component of the force. Using the fact that with the
coordinate choice shown in Fig. 2, $d\mathbf{r}\cdot d\mathbf{r}' = dx\,dx'$, we get

$$ F = -\frac{\mu_0 II'}{4\pi}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dx'\,\frac{\sin\theta}{d^2 + (x-x')^2} = -\frac{\mu_0 II'}{4\pi}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dx'\,\frac{d}{[d^2 + (x-x')^2]^{3/2}}. \tag{5.5} $$


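Introducing, instead of $x'$, a new, dimensionless variable $\xi = (x - x')/d$, we may reduce the internal
integral to a table integral which we have already met in this course:

$$ F = -\frac{\mu_0 II'}{4\pi d}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}\frac{d\xi}{(1+\xi^2)^{3/2}} = -\frac{\mu_0 II'}{2\pi d}\int_{-\infty}^{+\infty}dx. \tag{5.6} $$

The integral over $x$ is formally diverging, but this means merely that the interaction force per unit length
of the wires is constant:

$$ \frac{F}{l} = -\frac{\mu_0 II'}{2\pi d}. \tag{5.7} $$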


Note that the force drops rather slowly (only as $1/d$) as the distance $d$ between the wires is increased,
and is attractive (rather than repulsive as in the Coulomb law) if the currents are of the same sign.



Fig. 5.2. Magnetic force between two 
straight parallel currents. 


This is an important result,⁵ but again, the problems solvable so simply are few and far between,
and it is intuitively clear that we would strongly benefit from the same approach as in electrostatics, i.e.,
from breaking Eq. (1) into a product of two factors via the introduction of a suitable field. Such a
decomposition may be done as follows:

$$ \mathbf{F} = \int_V \mathbf{j}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\,d^3r, \tag{5.8} $$


5 In particular, Eq. (7) is used for the legal definition of the SI unit of current, one ampere (A), via the SI unit of 
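force (the newton, N), with coefficient $\mu_0$ fixed as listed above.

where vector $\mathbf{B}$ is called the magnetic field (in our particular case, induced by current $\mathbf{j}'$):⁶

$$ \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\mathbf{j}'(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r'. \tag{5.9} $$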


The last equation is called the Biot-Savart law, while $\mathbf{F}$ expressed by Eq. (8) is sometimes called the
Lorentz force.⁷ However, more frequently the latter term is reserved for the full force,

$$ \mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}), \tag{5.10} $$

exerted by electric and magnetic fields on a point charge $q$, moving with velocity $\mathbf{v}$. (The
equivalence of Eq. (8) and the magnetic part of Eq. (10) follows from the summation of all forces acting
on $n$ particles in a unit volume, moving with the same velocity $\mathbf{v}$, so that $\mathbf{j} = qn\mathbf{v}$.)


Now we have to prove that the new formulation (8)-(9) is equivalent to Eq. (1). At the first
glance, this seems unlikely. Indeed, first of all, Eqs. (8) and (9) involve vector products, while Eq. (1) is
based on a scalar product. More profoundly, in contrast to Eq. (1), Eqs. (8) and (9) do not satisfy the 3rd
Newton law, applied to elementary current components $\mathbf{j}\,d^3r$ and $\mathbf{j}'d^3r'$, if these vectors are not parallel
to each other. Indeed, consider the situation shown in Fig. 3. Here vector $\mathbf{j}'$ is perpendicular to vector
$(\mathbf{r} - \mathbf{r}')$, and hence, according to Eq. (9), produces a nonvanishing contribution $d\mathbf{B}'$ to the magnetic field,
directed (in Fig. 3) perpendicular to the plane of drawing, i.e. perpendicular to vector $\mathbf{j}$. Hence,
according to Eq. (8), this field provides a nonvanishing contribution to $\mathbf{F}$. On the other hand, if we
calculate the reciprocal force $\mathbf{F}'$ by swapping indices in Eqs. (8) and (9), the latter equation immediately
shows that $d\mathbf{B}(\mathbf{r}') \propto \mathbf{j}\times(\mathbf{r} - \mathbf{r}') = 0$, because the two operand vectors are parallel (Fig. 3). Hence, the
current component $\mathbf{j}'d^3r'$ does exert a force on its counterpart, while $\mathbf{j}\,d^3r$ does not.

Fig. 5.3. Apparent violation of the 3rd Newton law in magnetism.

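Despite this apparent problem, let us still go ahead and plug Eq. (9) into Eq. (8):

$$ \mathbf{F} = \frac{\mu_0}{4\pi}\int_V d^3r\int_{V'} d^3r'\,\mathbf{j}(\mathbf{r})\times\left(\mathbf{j}'(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\right). \tag{5.11} $$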


6 The SI unit of the magnetic field is called tesla, T - after N. Tesla, an electrical engineering pioneer. In the 
Gaussian units, the already discussed constant 1/c 2 in Eq. (1) is equally divided between Eqs. (8) and (9), so that 
in them both, the constant before the integral is 1/c. The resulting Gaussian unit of field B is called gauss (G); 
taking into account the difference of units of electric charge and length, and hence current density, 1 G equals
exactly $10^{-4}$ T. Note also that in some textbooks, especially old ones, $\mathbf{B}$ is called either the magnetic induction, or
the magnetic flux density, while the term “magnetic field” is reserved for vector $\mathbf{H}$ that will be introduced in Sec. 5
below.

7 Named after H. Lorentz, who received a Nobel prize for his explanation of the Zeeman effect, but is
more famous for his numerous contributions to the development of special relativity - see Chapter 9. To
be fair, the magnetic part of the Lorentz force was correctly calculated first by O. Heaviside.

This double vector product may be transformed into two scalar products, using the vector algebraic identity
called the bac minus cab rule, $\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b})$.⁸ Applying this relation, with $\mathbf{a} = \mathbf{j}$, $\mathbf{b} = \mathbf{j}'$, and
$\mathbf{c} = \mathbf{R} \equiv \mathbf{r} - \mathbf{r}'$, to Eq. (11), we get

$$ \mathbf{F} = \frac{\mu_0}{4\pi}\int_{V'} d^3r'\,\mathbf{j}'(\mathbf{r}')\int_V d^3r\,\frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{R}}{R^3} - \frac{\mu_0}{4\pi}\int_V d^3r\int_{V'} d^3r'\,[\mathbf{j}(\mathbf{r})\cdot\mathbf{j}'(\mathbf{r}')]\frac{\mathbf{R}}{R^3}. \tag{5.12} $$


The second term in the right-hand part of this equation coincides with the right-hand part of Eq. (1),
while the first term equals zero, because its internal integral vanishes. Indeed, we may break
volumes $V$ and $V'$ into narrow current tubes, the stretched sub-volumes whose walls are not crossed by
current lines ($j_n = 0$). As a result, the (infinitesimal) current in each tube, $dI = j\,dA = j\,d^2r$, is the same
along its length, and, just as in a thin wire, $\mathbf{j}\,d^3r$ may be replaced with $dI\,d\mathbf{r}$. Because of this, each tube’s
contribution to the internal integral in the first term of Eq. (12) may be presented as

$$ dI\oint_l d\mathbf{r}\cdot\frac{\mathbf{R}}{R^3} = -dI\oint_l d\mathbf{r}\cdot\nabla\frac{1}{R}, \tag{5.13} $$

where operator $\nabla$ acts in the $\mathbf{r}$ space, and the integral is taken along the tube’s length $l$. Due to the current
continuity, each tube should follow a closed contour, and an integral of a full differential of some scalar
function (in our case, $1/R$) along it equals zero.

So we have recovered Eq. (1). Returning for a minute to the paradox illustrated with Fig. 3, we
may conclude that the apparent violation of the 3rd Newton law was an artifact of our interpretation of
Eqs. (8) and (9) as sums of independent elementary components. In reality, due to the dc current
continuity expressed by Eq. (4.6), these components are not independent. For the whole currents, Eqs.
(8)-(9) do obey the 3rd law - as follows from their already proved equivalence to Eq. (1).
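
This conclusion may be illustrated numerically for two closed current loops. The sketch below is a hedged example, not a part of the proof: it assumes thin wires (so that $\mathbf{j}\,d^3r$ is replaced with $I\,d\mathbf{r}$, just as was done for Eq. (4)), approximates each loop by straight segments, and uses arbitrary illustrative loop positions, radii and currents. All function names are hypothetical.

```python
import numpy as np

MU0_OVER_4PI = 1e-7  # mu_0 / (4 pi) in SI units, as quoted in the text

def make_loop(center, radius, tilt, n=400):
    """Approximate a circular loop, tilted about the x axis, by n straight segments.
    Returns the segment midpoints and the segment vectors."""
    t = np.linspace(0.0, 2.0 * np.pi, n + 1)
    pts = np.stack([radius * np.cos(t),
                    radius * np.sin(t) * np.cos(tilt),
                    radius * np.sin(t) * np.sin(tilt)], axis=1)
    pts = pts + np.asarray(center, dtype=float)
    return 0.5 * (pts[1:] + pts[:-1]), pts[1:] - pts[:-1]

def force_on(mid_a, dl_a, I_a, mid_b, dl_b, I_b):
    """Force on loop a due to loop b: thin-wire forms of Eq. (5.9) for B and Eq. (5.8) for F."""
    R = mid_a[:, None, :] - mid_b[None, :, :]            # r - r' for every pair of segments
    R3 = np.linalg.norm(R, axis=2, keepdims=True) ** 3
    B = MU0_OVER_4PI * I_b * np.sum(np.cross(dl_b[None, :, :], R) / R3, axis=1)
    return I_a * np.sum(np.cross(dl_a, B), axis=0)

# Two loops in a generic (non-coaxial, non-coplanar) mutual position.
loop1 = make_loop(center=(0.0, 0.0, 0.0), radius=1.0, tilt=0.0)
loop2 = make_loop(center=(0.5, 0.2, 1.5), radius=0.7, tilt=0.6)
I1, I2 = 1.0, 2.0

F12 = force_on(*loop1, I1, *loop2, I2)   # force on loop 1 from loop 2
F21 = force_on(*loop2, I2, *loop1, I1)   # force on loop 2 from loop 1

print("force on loop 1:", F12)
print("force on loop 2:", F21)
print("relative residual |F12 + F21| / |F12|:",
      np.linalg.norm(F12 + F21) / np.linalg.norm(F12))
```

The individual segment pairs do not satisfy the action-reaction rule, but the two net forces cancel to within the numerical error, because each loop is closed.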

Thus we have been able to break the magnetic interaction into the two effects: the creation of the
magnetic field $\mathbf{B}$ by one current (in our notation, $\mathbf{j}'$), and the effect of this field on the other current ($\mathbf{j}$).
Now comes an additional experimental fact: other elementary components $\mathbf{j}\,d^3r'$ of current $\mathbf{j}$ also
contribute to the magnetic field (9) acting on component $\mathbf{j}\,d^3r$.⁹ This fact allows us to drop the prime after $\mathbf{j}$
in Eq. (9), and rewrite Eqs. (8) and (9) as

$$ \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\mathbf{j}(\mathbf{r}')\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.14} $$

$$ \mathbf{F} = \int_V\mathbf{j}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\,d^3r. \tag{5.15} $$

Again, the field observation point r and the field source point r ’ have to be clearly distinguished. We 
immediately see that these expressions are similar to, but still different from the corresponding relations 
of the electrostatics, namely Eq. (1.8), 


8 See, e.g., MA Eq. (7.5). 

9 Just as in electrostatics, one needs to exercise due caution at transfer from these expressions to the limit of discrete
classical particles, and extended wavefunctions in quantum mechanics, in order to avoid the (non-existing)
magnetic interaction of a charged particle upon itself.

$$ \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_{V'}\rho(\mathbf{r}')\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.16} $$

and the distributed version of Eq. (1.6):

$$ \mathbf{F} = \int_V\rho(\mathbf{r})\mathbf{E}(\mathbf{r})\,d^3r. \tag{5.17} $$


(Note that the sign difference has disappeared, at the cost of the replacement of scalar-by-vector 
multiplications in electrostatics with cross-products of vectors in magnetostatics.) 

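For the frequent case of a field of a thin wire of length $l'$, Eq. (14) may be re-written as

$$ \mathbf{B}(\mathbf{r}) = \frac{\mu_0 I}{4\pi}\int_{l'}d\mathbf{r}'\times\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}. \tag{5.18} $$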


Let us see how the last formula works for the simplest case of a straight wire (Fig. 4a). The
magnetic field contribution $d\mathbf{B}$ due to any small fragment $d\mathbf{r}'$ of the wire’s length is directed along the
same line (perpendicular to both the wire and the perpendicular $d$ dropped from the observation point to
the wire line), and its magnitude is

$$ dB = \frac{\mu_0 I}{4\pi}\frac{\sin\theta}{|\mathbf{r}-\mathbf{r}'|^2}\,dx' = \frac{\mu_0 I}{4\pi}\frac{d}{(x'^2 + d^2)^{3/2}}\,dx'. \tag{5.19} $$

Summing up all such contributions, we get

$$ B = \frac{\mu_0 I d}{4\pi}\int_{-\infty}^{+\infty}\frac{dx'}{(x'^2 + d^2)^{3/2}} = \frac{\mu_0 I}{2\pi d}. \tag{5.20} $$



Fig. 5.4. Magnetic fields of: (a) a straight current, and (b) a current loop. 

This is a simple but very important result. (Note that it is only valid for very long ($l \gg d$),
straight wires.) It is especially crucial to note the “vortex” character of the field: its lines go around the
wire, forming round rings with the centers on the current line. This is in sharp contrast to the
electrostatic field lines that can only begin and end on electric charges and never form closed loops
(otherwise the Coulomb force $q\mathbf{E}$ would not be conservative). In the magnetic case, the vortex field may
be reconciled with the potential character of magnetic forces, which is evident from Eq. (1), due to the
vector products in Eqs. (14)-(15).

Now we may use Eq. (15), or rather its thin-wire version

$$ \mathbf{F} = I\oint_l d\mathbf{r}\times\mathbf{B}(\mathbf{r}), \tag{5.21} $$

to apply Eq. (20) to the two-wire problem (Fig. 2). Since for the second wire vectors $d\mathbf{r}$ and $\mathbf{B}$ are
perpendicular to each other, we immediately arrive at our previous result (7).
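
To get a feeling of how long the wire has to be for Eq. (20) to hold, one may sum contributions (19) over a wire of a finite length. The following minimal sketch assumes that the observation point is at distance $d$ from the wire’s midpoint; the function names, the midpoint-rule discretization, and the sample lengths are illustrative assumptions.

```python
import numpy as np

MU0 = 4e-7 * np.pi   # magnetic constant in SI units, i.e. mu_0 / (4 pi) = 1e-7

def finite_wire_field(I, d, length, n=200_000):
    """Field magnitude at distance d from the midpoint of a straight wire of a finite length,
    obtained by summing Eq. (5.19) over n equal segments (midpoint rule)."""
    dx = length / n
    x = (np.arange(n) + 0.5) * dx - length / 2.0      # segment midpoints
    return MU0 * I / (4 * np.pi) * np.sum(d / (x**2 + d**2)**1.5) * dx

def infinite_wire_field(I, d):
    """Eq. (5.20) for an infinitely long wire."""
    return MU0 * I / (2 * np.pi * d)

I, d = 1.0, 1.0
ratio_short = finite_wire_field(I, d, 2.0 * d) / infinite_wire_field(I, d)
ratio_long = finite_wire_field(I, d, 200.0 * d) / infinite_wire_field(I, d)

print("l = 2d:   B / B_infinite =", ratio_short)
print("l = 200d: B / B_infinite =", ratio_long)
```

For this geometry the sum converges to the elementary result $B/B_\infty = (l/2)/[(l/2)^2 + d^2]^{1/2}$, so that a wire only twice longer than $d$ gives about 71% of the field (20), while at $l = 200d$ the difference is already below 0.01%.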

The next important application of the Biot-Savart law (14) is the magnetic field at the axis of a
circular current loop (Fig. 4b). Due to the problem symmetry, the net field $\mathbf{B}$ has to be directed along the
axis, but each of its components $d\mathbf{B}$ is tilted by angle $\theta = \arctan(z/R)$ to this axis, so that its axial
component is

$$ dB_z = dB\cos\theta = \frac{\mu_0 I}{4\pi}\frac{dr'}{R^2 + z^2}\frac{R}{(R^2 + z^2)^{1/2}}. \tag{5.22} $$


Since the denominator of this expression remains the same for all wire components $dr'$, in this case the
integration is trivial ($\oint dr' = 2\pi R$), giving finally

$$ B = \frac{\mu_0 I}{2}\frac{R^2}{(R^2 + z^2)^{3/2}}. \tag{5.23} $$

Note that the magnetic field in the loop’s center (i.e., for $z = 0$),

$$ B = \frac{\mu_0 I}{2R}, \tag{5.24} $$


is $\pi$ times higher than that due to a similar current in a straight wire, at distance $d = R$ from it. This
increase is readily understandable, since all elementary components of the loop are at the same distance
$R$ from the observation point, while in the case of a straight wire, all its points but one are separated from
the observation point by a distance larger than $d$.

Another notable fact is that at large distances ($z^2 \gg R^2$), field (23) is proportional to $z^{-3}$:

$$ B \approx \frac{\mu_0 I}{2}\frac{R^2}{|z|^3} = \frac{\mu_0}{4\pi}\frac{2m}{|z|^3}, \tag{5.25} $$


just like the electric field of a dipole (along its direction), with the replacement of the electric dipole
moment magnitude $p$ with $m = IA$, where $A = \pi R^2$ is the loop area. This is the best example of a
magnetic dipole, with dipole moment $\mathbf{m}$ - the notions to be discussed in more detail in Sec. 5 below.
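
The loop results are also easy to verify by a direct discretization of the Biot-Savart integral (18). In the sketch below (an illustration only; the loop parameters, the number of segments, and the function names are assumptions) the loop lies in the plane $z = 0$ with its center at the origin.

```python
import numpy as np

MU0 = 4e-7 * np.pi   # magnetic constant in SI units

def axis_field_numeric(I, R, z, n=2000):
    """Axial component of B at height z on the loop axis, from a discretized Eq. (5.18)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dt = 2.0 * np.pi / n
    src = np.stack([R * np.cos(t), R * np.sin(t), np.zeros(n)], axis=1)        # points r'
    dl = np.stack([-R * np.sin(t), R * np.cos(t), np.zeros(n)], axis=1) * dt   # elements dr'
    sep = np.array([0.0, 0.0, z]) - src                                        # r - r'
    dB = np.cross(dl, sep) / np.linalg.norm(sep, axis=1, keepdims=True) ** 3
    return MU0 * I / (4 * np.pi) * dB.sum(axis=0)[2]

def axis_field_formula(I, R, z):
    """Eq. (5.23)."""
    return MU0 * I * R**2 / (2 * (R**2 + z**2) ** 1.5)

def dipole_field(I, R, z):
    """Eq. (5.25), with the magnetic dipole moment m = I * pi * R**2."""
    m = I * np.pi * R**2
    return MU0 / (4 * np.pi) * 2 * m / abs(z) ** 3

I, R = 2.0, 0.5
near = axis_field_numeric(I, R, 0.3) / axis_field_formula(I, R, 0.3)
center_vs_wire = axis_field_formula(I, R, 0.0) / (MU0 * I / (2 * np.pi * R))
far = axis_field_formula(I, R, 100 * R) / dipole_field(I, R, 100 * R)

print("numeric / Eq. (5.23) at z = 0.3 m:", near)
print("loop center / straight wire at d = R:", center_vs_wire)
print("Eq. (5.23) / dipole approximation at z = 100 R:", far)
```

The second printed ratio reproduces the factor $\pi$ discussed above, while the third one shows that at $z = 100R$ the dipole approximation (25) is accurate to about $1.5\times 10^{-4}$.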


5.2. Vector-potential and the Ampère law

The reader can see that the calculations of the magnetic field using Eq. (14) or (18) are still
cumbersome even for the very simple systems we have examined. As we saw in Chapter 1, similar
calculations in electrostatics, at least for several important systems of high symmetry, could be
substantially simplified using the Gauss law (1.16). A similar relation exists in magnetostatics as well,
but has a different form, due to the vortex character of the magnetic field. To derive it, let us notice that
in analogy with the scalar case, the vector product under integral (14) may be transformed as

$$ \frac{\mathbf{j}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3} = \nabla\times\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}, \tag{5.26} $$


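where operator $\nabla$ acts in the $\mathbf{r}$ space. (This equality may be readily verified by its Cartesian components,
noticing that the current density is a function of $\mathbf{r}'$ and hence its components are independent of $\mathbf{r}$.)
Plugging Eq. (26) into Eq. (14), and moving operator $\nabla$ out of the integral over $\mathbf{r}'$, we see that the
magnetic field may be presented as the curl of another vector field:¹⁰

$$ \mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r}), \tag{5.27} $$

namely the so-called vector-potential

$$ \mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_{V'}\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.28} $$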


Please note a wonderful analogy between Eqs. (27)-(28) and, respectively, Eqs. (1.33) and (1.38). This
analogy implies that vector-potential A plays, for the magnetic field, essentially the same role as the
scalar potential φ plays for the electric field (hence the name “potential”), with due respect to the vortex
character of A. I will discuss this notion in detail below.

Now let us see what equations we may get for the spatial derivatives of the magnetic field. First,
vector algebra says that the divergence of any curl is zero. 11 In application to Eq. (27), this means that

$$\nabla\cdot\mathbf{B} = 0. \tag{5.29}$$


Comparing this equation with Eq. (1.27), we see that Eq. (29) may be interpreted as the absence of a
magnetic analog of an electric charge on which magnetic field lines could originate or end. Numerous
searches for such hypothetical magnetic charges, called magnetic monopoles, using very sensitive and
sophisticated experimental setups, have never given convincing evidence of their existence in Nature.

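Proceeding to the alternative, vector derivative of the magnetic field (i.e., its curl), and using Eq.
(28), we get

$$\nabla\times\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\nabla\times\int_{V'} \nabla\times\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.30}$$

This expression may be simplified by using the following general vector identity: 12

$$\nabla\times(\nabla\times\mathbf{c}) = \nabla(\nabla\cdot\mathbf{c}) - \nabla^2\mathbf{c}, \tag{5.31}$$

applied to vector c(r) ≡ j(r')/|r - r'|:

$$\nabla\times\mathbf{B} = \frac{\mu_0}{4\pi}\,\nabla\int_{V'} \mathbf{j}(\mathbf{r}')\cdot\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,d^3r' - \frac{\mu_0}{4\pi}\int_{V'} \mathbf{j}(\mathbf{r}')\,\nabla^2\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.32}$$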


As was already discussed during our study of electrostatics,

$$\nabla^2\frac{1}{|\mathbf{r}-\mathbf{r}'|} = -4\pi\delta(\mathbf{r}-\mathbf{r}'), \tag{5.33}$$

so that the last term of Eq. (32) is just μ₀j(r). On the other hand, inside the first integral we can replace
∇ with (-∇'), where prime means differentiation in the space of radius-vector r'. Integrating that term by
parts, we get

$$\nabla\times\mathbf{B} = -\frac{\mu_0}{4\pi}\,\nabla\oint_{S'} \frac{j_n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^2r' + \frac{\mu_0}{4\pi}\,\nabla\int_{V'} \frac{\nabla'\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r' + \mu_0\mathbf{j}(\mathbf{r}). \tag{5.34}$$

10 In the Gaussian units, Eq. (27) remains the same, and hence in Eq. (28), coefficient μ₀/4π is replaced with 1/c.

11 See, e.g., MA Eq. (11.2).

12 See, e.g., MA Eq. (11.3).

Applying this equation to the volume V’ limited by a surface S’ sufficiently distant from the field 
concentration (or with no current crossing it), we may neglect the first term in the right-hand part of Eq. 
(34), while the second term always equals zero in statics, due to the dc charge continuity - see Eq. (4.6). 
As a result, we arrive at a very simple differential equation 13 

$$\nabla\times\mathbf{B} = \mu_0\mathbf{j}. \tag{5.35}$$


This is (the dc form of) the inhomogeneous Maxwell equation, which in magnetostatics plays the
role similar to the Poisson equation (1.27) in electrostatics. Let me display, for the first time in this
course, this fundamental system of equations (at this stage, for statics only), and give the reader a minute
to stare at their beautiful symmetry - that has inspired so much of the 20th century physics:


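$$\begin{aligned} \nabla\times\mathbf{E} &= 0, & \nabla\times\mathbf{B} &= \mu_0\mathbf{j},\\ \nabla\cdot\mathbf{E} &= \frac{\rho}{\varepsilon_0}, & \nabla\cdot\mathbf{B} &= 0. \end{aligned} \tag{5.36}$$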


Their only asymmetry, two zeros in the right-hand parts (for the magnetic field’s divergence and electric
field’s curl), is due to the absence in Nature of, respectively, the magnetic monopoles and their currents.
I will discuss these equations in more detail in Sec. 6.7, after the equations for field curls have been
generalized to their full (time-dependent) versions.


Returning now to a more mundane but important task of calculating the magnetic field induced by
simple current configurations, we can benefit from an integral form of Eq. (35). For that, let us integrate
this equation over an arbitrary surface S limited by a closed contour C, applying to it the Stokes
theorem. 14 The resulting expression,

$$\oint_C \mathbf{B}\cdot d\mathbf{r} = \mu_0\int_S j_n\,d^2r \equiv \mu_0 I, \tag{5.37}$$

where I is the net electric current crossing surface S, is called the Ampere law.


As the first example of its application, let us return to a current in a straight wire (Fig. 4a). With
the Ampere law in our arsenal, we can readily pursue an even more ambitious goal - calculate the
magnetic field both outside and inside of a wire of arbitrary radius R, with an arbitrary (albeit
axially-symmetric) current distribution j(ρ) - see Fig. 5. Selecting two contours C in the form of rings of some
radius ρ in the plane perpendicular to the wire axis z, we have B·dr = Bρ(dφ), where φ is the azimuthal
angle, so that the Ampere law (37) yields:

$$2\pi\rho B = \mu_0 \times \begin{cases} 2\pi\displaystyle\int_0^{\rho} j(\rho')\rho'\,d\rho', & \text{for } \rho \le R,\\[6pt] 2\pi\displaystyle\int_0^{R} j(\rho')\rho'\,d\rho' \equiv I, & \text{for } \rho \ge R. \end{cases} \tag{5.38}$$

13 As in all earlier formulas for the magnetic field, in the Gaussian units the coefficient μ₀ in this relation has to be
replaced with 4π/c.

14 See, e.g., MA Eq. (12.1) with f = B.


Thus we have not only recovered our previous result (20), with the notation replacement d → ρ, in a
much simpler way, but could also find the magnetic field distribution inside the wire. (In the most
common case when the wire conductivity σ is constant, and hence the current is uniformly distributed
along its cross-section, j(ρ) = const, the first of Eqs. (38) immediately yields B ∝ ρ for ρ ≤ R.)


Fig. 5.5. The simplest application of the Ampere law: dc current in a straight wire.
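The two branches of Eq. (5.38) are easy to evaluate numerically for any axially-symmetric current density. The sketch below is a minimal illustration under stated assumptions: the function names, the midpoint-rule integration, and the SI value of μ₀ are choices made here, not part of the text.

```python
import math

MU0 = 4e-7 * math.pi  # conventional SI value of the vacuum permeability (H/m), assumed here


def enclosed_current(j, rho, steps=10000):
    """Midpoint-rule estimate of 2*pi * integral_0^rho j(rho') rho' d(rho')."""
    h = rho / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += j(r) * r
    return 2.0 * math.pi * total * h


def wire_field(j, wire_radius, rho):
    """Azimuthal field B(rho) from Eq. (5.38) for an axially-symmetric density j(rho)."""
    # Outside the wire the whole current I is enclosed, so the integral stops at R.
    upper = min(rho, wire_radius)
    return MU0 * enclosed_current(j, upper) / (2.0 * math.pi * rho)


if __name__ == "__main__":
    R, j0 = 2e-3, 1e6  # hypothetical wire radius (m) and uniform current density (A/m^2)
    for rho in (0.5e-3, 1e-3, 2e-3, 4e-3, 8e-3):
        print(rho, wire_field(lambda r: j0, R, rho))
```

For a uniform density the printed values grow linearly with ρ inside the wire and fall as 1/ρ outside it, as stated above.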


Another important example is a straight, long solenoid (Fig. 6a), with dense winding: n²A ≫ 1,
where n is the number of wire turns per unit length and A is the area of solenoid’s cross-section - not
necessarily circular.



From the symmetry of this problem, the longitudinal (in Fig. 6a, vertical) component B_z of the
magnetic field may only depend on the horizontal position ρ of the observation point. First taking a
plane Ampere contour C₁, with both long sides outside the solenoid, we get B_z(ρ₁) - B_z(ρ₂) = 0, because
the total current piercing the contour equals zero. This is only possible if B_z = 0 at any ρ outside of the
(infinitely long!) solenoid. 15 With this result on hand, from contour C₂ we get the following relation for
the only (z-) component of the internal field:

$$Bl = \mu_0 N I, \tag{5.39}$$


where N is the number of wire turns passing through the contour of length l. This means that regardless
of the exact position of the internal side of the contour, the result is the same:

$$B = \mu_0\frac{N}{l}\,I = \mu_0 n I. \tag{5.40}$$


Thus, the field inside an infinitely long solenoid is uniform; in this sense, a long solenoid is a magnetic
analog of a wide plane capacitor.
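To get a feeling of how long a solenoid must be for Eq. (5.40) to hold, one can integrate the on-axis field (5.23) of a single loop over a stack of loops. The sketch below assumes a round solenoid treated as a continuous current sheet with n turns per unit length; this model, the function name, and the numerical parameters are assumptions made for illustration only.

```python
import math

MU0 = 4e-7 * math.pi  # conventional SI value of the vacuum permeability (H/m), assumed here


def solenoid_center_field(n, current, radius, length, slices=20000):
    """Field at the center of a finite round solenoid, summing loop fields (5.23)."""
    dz = length / slices
    total = 0.0
    for k in range(slices):
        z = -length / 2.0 + (k + 0.5) * dz
        # Each slice of thickness dz carries the current n*I*dz of a single loop.
        total += radius**2 / (2.0 * (radius**2 + z**2) ** 1.5)
    return MU0 * n * current * total * dz


if __name__ == "__main__":
    n, I, R = 1000.0, 1.0, 1.0
    for length in (2.0, 10.0, 50.0, 200.0):
        ratio = solenoid_center_field(n, I, R, length) / (MU0 * n * I)
        print(length / R, ratio)
```

In this model the ratio to μ₀nI equals l/(4R² + l²)^(1/2), so that the infinite-solenoid result is approached rather quickly as l/R grows.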

As should be clear from its derivation, the obtained result, especially that the field outside of the
solenoid equals zero, is conditional on the solenoid length being very large in comparison with its lateral
size. (From Eq. (25), we may predict that for a solenoid of a finite length l, the external field is only a
factor of ~A/l² lower than the internal one.) Much better suppression of this external (“fringe”) field may
be obtained using the toroidal solenoid (Fig. 6b). The application of the Ampere law to this geometry shows
that, in the limit of dense winding (N ≫ 1), there is no fringe field at all - for any relation between two
radii of the torus, while inside the solenoid, at distance ρ from the center,

$$B = \frac{\mu_0 N I}{2\pi\rho}. \tag{5.41}$$


We see that a possible drawback of this system for practical applications is that the internal field depends on
ρ, i.e. is not quite uniform; however, if the torus is thin, this problem is minor.

How should we solve the problems of magnetostatics for systems whose low symmetry does not
allow getting easy results from the Ampere law? (The examples are of course too numerous to list; for
example, we cannot use this approach even to reproduce Eq. (23) for a round current loop.) From the
deep analogy with electrostatics, we may expect that in this case we could recover the field from the
solution of a certain boundary problem for the field’s potential, in this case the vector-potential
A defined by Eq. (28). However, despite the similarity of this formula and Eq. (1.38) for φ, that was
emphasized above, there are two additional issues we should tackle in the magnetic case.

First, finding the vector-potential distribution means determining three scalar functions (say, A_x, A_y,
and A_z), rather than one (φ). Second, generally the differential equation satisfied by A is more complex
than the Poisson equation for φ. Indeed, plugging Eq. (27) into Eq. (35), we get

$$\nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{j}. \tag{5.42}$$


If we wrote the left-hand part of this equation in (say, Cartesian) components, we would see that they
are much more interwoven than in the Laplace operator, and hence much less convenient for using the
orthogonal coordinate approach or the variable separation method. In order to remedy the situation, let
us apply to Eq. (42) the now-familiar identity (31). The result is

$$\nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \mu_0\mathbf{j}. \tag{5.43}$$

15 Applying the Ampere law to a circular contour of radius ρ, coaxial with the solenoid, we see that the field
outside (but not inside!) it has an azimuthal component B_φ, similar to that of the straight wire (see Eq. (38) above)
and hence (at N ≫ 1) much weaker than the longitudinal field inside the solenoid - see Eq. (40).


We see that if we could kill the first term in the left-hand part, for example if ∇·A = 0, the second term
would give us a set of independent Poisson equations for each Cartesian component of vector A.

In this context, let us discuss what discretion we have in the choice of the potential. In electrostatics,
we might add to φ not only an arbitrary constant, but also an arbitrary function of time, without
affecting the electric field:

$$-\nabla[\phi + f(t)] = -\nabla\phi = \mathbf{E}. \tag{5.44}$$

Similarly, using the fact that curl of the gradient of any scalar function equals zero, 16 we may add to A
not only a constant, but even a gradient of an arbitrary function χ(r, t), because

$$\nabla\times(\mathbf{A} + \nabla\chi) = \nabla\times\mathbf{A} + \nabla\times(\nabla\chi) = \nabla\times\mathbf{A} = \mathbf{B}. \tag{5.45}$$

Such additions, keeping the actual (observable) fields intact, are called gauge transformations. 17 Let us
see what such a transformation does to ∇·A:

$$\nabla\cdot(\mathbf{A} + \nabla\chi) = \nabla\cdot\mathbf{A} + \nabla^2\chi. \tag{5.46}$$


Hence we can choose a function χ in such a way that the divergence of the transformed vector-potential,
A' ≡ A + ∇χ, would vanish, so that the new vector-potential would satisfy the vector Poisson equation

$$\nabla^2\mathbf{A}' = -\mu_0\mathbf{j}, \tag{5.47}$$

together with the so-called Coulomb gauge condition:

$$\nabla\cdot\mathbf{A}' = 0. \tag{5.48}$$

This gauge is very convenient; one should, however, remember that the resulting solution A'(r) may
differ from the function given by Eq. (28) - while field B remains the same. 18
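To make the choice of χ more explicit, one may note (as a sketch that is not spelled out in the text) that the requirement of zero divergence of the transformed potential is itself a Poisson equation for χ. Assuming that ∇·A drops at large distances fast enough for the integral to converge, it may be solved with the help of Eq. (33):

$$\nabla\cdot\mathbf{A} + \nabla^2\chi = 0 \quad\Longrightarrow\quad \nabla^2\chi = -\nabla\cdot\mathbf{A}, \qquad \chi(\mathbf{r}) = \frac{1}{4\pi}\int \frac{\nabla'\cdot\mathbf{A}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'.$$

Indeed, acting on this χ with the Laplace operator and using Eq. (33) under the integral returns -∇·A(r).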


In order to get a better feeling of the vector-potential’s distribution in space, let us solve Eq. (47) for
the same straight wire problem (Fig. 5). As Eq. (28) shows, in this case vector A has just one component
(along the axis z). Moreover, due to the problem’s axial symmetry, its magnitude may only depend on
the distance from the axis: A = n_z A(ρ). Hence, the gradient of A is directed across axis z, so that Eq. (48)
is satisfied even for this vector, i.e. the Poisson equation (47) is satisfied even for the original vector A.
For our symmetry (∂/∂φ = ∂/∂z = 0), the Laplace operator, written in cylindrical coordinates, has just
one term, 19 reducing Eq. (47) to

$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{dA}{d\rho}\right) = -\mu_0 j(\rho). \tag{5.49}$$


Multiplying both parts of this equation by ρ and integrating them over the coordinate once, we get

$$\rho\frac{dA}{d\rho} = -\mu_0\int_0^{\rho} j(\rho')\rho'\,d\rho' + \text{const}. \tag{5.50}$$

16 See, e.g., MA Eq. (11.1).

17 The use of term “gauge” (originally meaning “a measure” or “a scale”) in this context is purely historic, so the
reader should not try to find too much hidden sense in it.

18 Since most equations for A are valid for A’ as well, I will follow the common (possibly, bad) tradition, and in
many cases use the same notation, A, for both functions.

19 See, e.g., MA Eq. (10.3).


Since in the cylindrical coordinates, for our symmetry, 20 B = -dA/dρ, Eq. (50) is nothing else than our
old result (38) for the magnetic field. 21 However, let us continue the integration, at least for the region
outside the wire, where the function A(ρ) depends only on the full current I rather than on the current
distribution inside the wire. Dividing both parts of Eq. (50) by ρ, and integrating them over that
coordinate again, we get

$$A(\rho) = -\frac{\mu_0 I}{2\pi}\ln\rho + \text{const}, \quad\text{where } I = 2\pi\int_0^R j(\rho)\rho\,d\rho. \tag{5.51}$$


As a reminder, we had the similar logarithmic behavior for the electrostatic potential outside a 
uniformly charged straight line. This is natural, because the Poisson equations for both cases are similar. 

Now let us find the vector-potential for the long solenoid (Fig. 6a), with its uniform magnetic
field. Since Eq. (28) prescribes vector A to follow the direction of the current, we can start with looking
for it in the form A = n_φ A(ρ). (This is especially natural if the solenoid’s cross-section is circular.) With
this orientation of A, the same general expression for the curl operator in cylindrical coordinates yields
∇×A = n_z (1/ρ) d(ρA)/dρ. According to the definition (27) of A, this expression should be equal to B, in
our case equal to n_z B, with constant B - see Eq. (40). Integrating this equality, and selecting such
integration constant that A(0) is finite, we get

$$A(\rho) = \frac{B\rho}{2}. \tag{5.52}$$

Plugging this result into the general expression for the Laplace operator in the cylindrical
coordinates, 22 we see that the Poisson equation (47) with j = 0 (i.e. the Laplace equation), is satisfied
again - which is natural since for this distribution, ∇·A = 0. However, Eq. (52) is not the unique (or
even the simplest) solution of the problem. Indeed, using the well-known expression for the curl
operator in Cartesian coordinates, 23 it is straightforward to check that either function A' = n_y Bx, or
function A'' = -n_x By, or any of their weighted sums, for example A''' = (A' + A'')/2 = B(-n_x y + n_y x)/2,
also give the same magnetic field, and also evidently satisfy the Laplace equation. If such solutions do
not look very natural due to their anisotropy in the [x, y] plane, please consider the fact that they
represent the uniform magnetic field regardless of its source (e.g., of the shape of the long solenoid’s
cross-section). Such choices of vector-potential may be very convenient for some problems, for example for
the analysis of the 2D motion of a charged quantum particle in the perpendicular magnetic field, giving
the famous Landau energy levels. 24
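The equivalence of these three choices may be checked with a few lines of code. The sketch below evaluates the z-component of the curl and the divergence of each potential by central finite differences; the helper names and the step `h` are assumptions made here for illustration.

```python
def curl_z(ax, ay, x, y, h=1e-5):
    """Central-difference estimate of (curl A)_z = dA_y/dx - dA_x/dy."""
    day_dx = (ay(x + h, y) - ay(x - h, y)) / (2.0 * h)
    dax_dy = (ax(x, y + h) - ax(x, y - h)) / (2.0 * h)
    return day_dx - dax_dy


def divergence(ax, ay, x, y, h=1e-5):
    """Central-difference estimate of div A = dA_x/dx + dA_y/dy."""
    dax_dx = (ax(x + h, y) - ax(x - h, y)) / (2.0 * h)
    day_dy = (ay(x, y + h) - ay(x, y - h)) / (2.0 * h)
    return dax_dx + day_dy


def uniform_field_gauges(b):
    """Three vector-potentials (A_x, A_y) describing the same uniform field n_z * b."""
    return {
        "A'   = n_y B x": (lambda x, y: 0.0, lambda x, y: b * x),
        "A''  = -n_x B y": (lambda x, y: -b * y, lambda x, y: 0.0),
        "A''' = B(-n_x y + n_y x)/2": (lambda x, y: -0.5 * b * y, lambda x, y: 0.5 * b * x),
    }


if __name__ == "__main__":
    for name, (ax, ay) in uniform_field_gauges(0.7).items():
        print(name, curl_z(ax, ay, 0.3, -1.2), divergence(ax, ay, 0.3, -1.2))
```

All three potentials return the same curl and a vanishing divergence, i.e. they all satisfy the Coulomb gauge condition (48); any two of them differ by the gradient of a function proportional to Bxy.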


20 See, e.g., MA Eq. (10.5) with ∂/∂φ = ∂/∂z = 0.

21 Since the magnetic field at the wire axis has to be zero (otherwise, being perpendicular to the axis, where would
it be directed?), the integration constant in Eq. (50) should be zero.

22 See, e.g., MA Eq. (10.6).

23 See, e.g., MA Eq. (8.5).

24 See, e.g., QM Sec. 3.2.


5.3. Magnetic energy, flux, and inductance

Considering currents flowing in a system as generalized coordinates, magnetic forces (1)
between them are their unique functions, and in this sense the magnetic interaction energy U may be
considered a potential energy of the system. The apparent (but deceptive) way to guess the energy is to
use the analogy between Eq. (1) and its electrostatic analog, Eq. (2). As we know from Chapter 1, if
these densities describe the distribution of the same charge, i.e. if ρ'(r) = ρ(r), then the self-interaction
of its elementary components corresponds to the potential energy expressed by Eq. (1.61):

$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2}\int d^3r\int d^3r'\,\frac{\rho(\mathbf{r})\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}. \tag{5.53}$$


Using the analogy, for the magnetic interaction between elementary components of the same current,
with density j'(r) = j(r), we could guess that

$$U = \frac{\mu_0}{4\pi}\,\frac{1}{2}\int d^3r\int d^3r'\,\frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}, \tag{5.54}$$

while for independent currents the coefficient ½ should be removed. Now let me confess that this is a
wrong way to get this correct result. Indeed, the sign in Eq. (1) is opposite to that in Eq. (2), so that
following this argumentation we would get Eq. (54) with the minus sign. The reason for this paradox is
fundamental: fixing electric charges does not require external interference (work), while the
maintenance of currents generally does. Strictly speaking, a derivation of Eq. (54) requires an additional
experimental fact, the Faraday induction law. However, I would like to defer its discussion until the
beginning of the next chapter, and for now ask the reader to believe me that the sign in Eq. (54) is
correct.


Due to the importance of this relation, let us rewrite it in several other forms, beneficial for
different applications. First of all, just as in electrostatics, Eq. (54) may be recast into a potential-based
form. Indeed, using definition (28) of the vector-potential A(r), Eq. (54) becomes 25

$$U = \frac{1}{2}\int \mathbf{j}(\mathbf{r})\cdot\mathbf{A}(\mathbf{r})\,d^3r. \tag{5.55}$$


This formula, that is a clear magnetic analog of Eq. (1.62) of electrostatics, is very popular among
theoretical physicists, because it is very handy for the field theory manipulations. However, for many
calculations it is more convenient to have a direct expression of energy via the magnetic field. Again,
this may be done very similarly to what we have done in Sec. 1.3 for electrostatics, i.e. plugging into Eq.
(55) the current density expressed from Eq. (35) to transform it as 26

$$U = \frac{1}{2}\int \mathbf{j}\cdot\mathbf{A}\,d^3r = \frac{1}{2\mu_0}\int \mathbf{A}\cdot(\nabla\times\mathbf{B})\,d^3r = \frac{1}{2\mu_0}\int \mathbf{B}\cdot(\nabla\times\mathbf{A})\,d^3r - \frac{1}{2\mu_0}\int \nabla\cdot(\mathbf{A}\times\mathbf{B})\,d^3r. \tag{5.56}$$


Now using the divergence theorem, the second integral may be transformed into a surface integral of
product (A×B)_n. Equations (27)-(28) show that if the current distribution j(r) is localized, this product
drops with distance r faster than 1/r², so that if the integration volume is large enough, the surface
integral is negligible. In the remaining first integral, we may use Eq. (27) to recast ∇×A into the
magnetic field. As a result, we get a very simple and fundamental formula:

$$U = \frac{1}{2\mu_0}\int B^2\,d^3r. \tag{5.57a}$$

25 This relation remains the same in the Gaussian units, because in those units both Eq. (28) and Eq. (54) should
be stripped of their μ₀/4π coefficients.

26 For that, we may use MA Eq. (11.7) with f = A and g = B, giving A·(∇×B) = B·(∇×A) - ∇·(A×B).


Just as with the electric field, this expression may be interpreted as a volume integral of the magnetic
energy density u:

$$U = \int u(\mathbf{r})\,d^3r, \quad\text{with } u(\mathbf{r}) \equiv \frac{1}{2\mu_0}B^2(\mathbf{r}), \tag{5.57b}$$

clearly similar to Eq. (1.67). 27 Again, the conceptual choice between the spatial localization of magnetic
energy - either at the location of electric currents only, as implied by Eqs. (54) and (55), or in all regions
where the magnetic field exists, as apparent from Eq. (57b), cannot be made within the framework of
magnetostatics, and only electrodynamics gives the decisive preference for the latter choice.


For the practically important case of currents flowing in several thin wires, Eq. (54) may be first
integrated over the cross-section of each wire, just as was done at the derivation of Eq. (4). Again, since
the integral of the current density over the k-th wire's cross-section is just the current I_k in the wire, and
cannot change along its length, it may be taken from the remaining integrals, giving

$$U = \frac{\mu_0}{4\pi}\,\frac{1}{2}\sum_{k,k'} I_k I_{k'} \oint_{l}\oint_{l'} \frac{d\mathbf{r}_k\cdot d\mathbf{r}_{k'}}{|\mathbf{r}_k-\mathbf{r}_{k'}|}, \tag{5.58}$$


where l is the full length of the wire loop. Note that Eq. (58) is valid if currents I_k are independent of
each other, because the double sum counts each current pair twice, compensating coefficient ½ in front
of the sum. It is useful to decompose this relation as

$$U = \frac{1}{2}\sum_{k,k'} L_{kk'} I_k I_{k'}, \tag{5.59}$$

$$L_{kk'} \equiv \frac{\mu_0}{4\pi}\oint_{l}\oint_{l'} \frac{d\mathbf{r}_k\cdot d\mathbf{r}_{k'}}{|\mathbf{r}_k-\mathbf{r}_{k'}|}. \tag{5.60}$$

Coefficient L_kk' in the quadratic form (59), with k ≠ k', is called the mutual inductance between
current loops k and k', while the diagonal coefficient L_k ≡ L_kk is called the self-inductance (or just
inductance) of the k-th loop. 28 From the symmetry of Eq. (60) with respect to the index swap, k ↔ k', it is
evident that the matrix of coefficients L_kk' is symmetric: 29

$$L_{kk'} = L_{k'k}, \tag{5.61}$$

so that for the practically important case of two interacting currents I₁ and I₂, Eq. (59) reads

$$U = \frac{L_1}{2}I_1^2 + M I_1 I_2 + \frac{L_2}{2}I_2^2, \tag{5.62}$$

where M ≡ L₁₂ = L₂₁ is the mutual inductance coefficient.

27 The transfer to the Gaussian units in Eqs. (57) may be accomplished by the usual replacement μ₀ → 4π,
thus giving, in particular, u = B²/8π.

28 As evident from Eq. (60), these coefficients depend only on the geometry of the system. Moreover, in the
Gaussian units, in which Eq. (60) is valid without the factor μ₀/4π, the inductance coefficients have the dimension
of length (centimeters). The SI unit of inductance is called the henry, abbreviated H - after J. Henry, 1797-1878,
who in particular discovered the effect of electromagnetic induction (see Sec. 6.1) independently of M. Faraday.

29 Note that the matrix of the mutual inductances L_kk' is very much similar to the matrix of reciprocal capacitance
coefficients p_kk' - for example, compare Eq. (62) with Eq. (2.21).
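The quadratic form (5.59) and its two-current case (5.62) are straightforward to evaluate once the inductance matrix is known. The sketch below is an illustration only; the function names and the numerical values of the inductances and currents are hypothetical.

```python
def magnetic_energy(inductance, currents):
    """U = (1/2) * sum over k, k' of L_kk' I_k I_k', as in Eq. (5.59)."""
    n = len(currents)
    return 0.5 * sum(
        inductance[k][kp] * currents[k] * currents[kp]
        for k in range(n)
        for kp in range(n)
    )


def two_loop_energy(l1, l2, m, i1, i2):
    """Two-current form of the same energy, Eq. (5.62)."""
    return 0.5 * l1 * i1**2 + m * i1 * i2 + 0.5 * l2 * i2**2


if __name__ == "__main__":
    # Hypothetical values: L1 = 2 mH, L2 = 5 mH, M = 1 mH, I1 = 3 A, I2 = -2 A.
    L = [[2e-3, 1e-3], [1e-3, 5e-3]]
    print(magnetic_energy(L, [3.0, -2.0]), two_loop_energy(2e-3, 5e-3, 1e-3, 3.0, -2.0))
```

Both functions return the same energy because the symmetric off-diagonal elements of the matrix are counted twice in the double sum, which compensates the factor ½ in front of it.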

These formulas clearly show the importance of self- and mutual inductances, so I will 
demonstrate their calculation for at least a few basic geometries. Before doing that, however, let me 
recast Eq. (58) into one more form that may facilitate such calculations. Namely, let us notice that for 
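the magnetic field induced by current I_k in a thin wire, Eq. (28) is reduced to

$$\mathbf{A}_k(\mathbf{r}) = \frac{\mu_0}{4\pi}\,I_k\oint_{l} \frac{d\mathbf{r}_k}{|\mathbf{r}-\mathbf{r}_k|}, \tag{5.63}$$

so that Eq. (58) may be rewritten as

$$U = \frac{1}{2}\sum_{k,k'} I_k\oint_{l} \mathbf{A}_{k'}(\mathbf{r}_k)\cdot d\mathbf{r}_k. \tag{5.64}$$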

But according to the same Stokes theorem that was used earlier in this chapter to derive the Ampere law,
and Eq. (27), such integral is nothing more than the magnetic field flux (more frequently called just the
magnetic flux) through a surface S limited by the contour l: 30

$$\oint_{l} \mathbf{A}(\mathbf{r})\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{A})_n\,d^2r = \int_S B_n\,d^2r \equiv \Phi. \tag{5.65}$$


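As a result, Eq. (64) may be rewritten as

$$U = \frac{1}{2}\sum_{k,k'} I_k\Phi_{kk'}, \tag{5.66}$$

where Φ_kk' is the flux of the field induced by the k'-th current through the loop of the k-th current.
Comparing this expression with Eq. (59), we see that

$$\Phi_{kk'} \equiv \int_{S_k} (\mathbf{B}_{k'})_n\,d^2r = L_{kk'} I_{k'}. \tag{5.67}$$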


This expression not only gives us one more means for calculating coefficients L_kk', but also
shows their physical sense: the mutual inductance characterizes how much field (colloquially, “how
many field lines”) induced by current I_k' penetrates the loop of current I_k, and vice versa. Due to the
linear superposition principle, the total flux piercing the k-th loop may be presented as

$$\Phi_k \equiv \sum_{k'} \Phi_{kk'} = \sum_{k'} L_{kk'} I_{k'}. \tag{5.68}$$

For example, for the system of two currents this expression is reduced to a clear analog of Eqs. (2.19):

$$\begin{aligned} \Phi_1 &= L_1 I_1 + M I_2,\\ \Phi_2 &= M I_1 + L_2 I_2. \end{aligned} \tag{5.69}$$

30 The SI unit of magnetic flux is called weber, abbreviated Wb - after W. Weber, who in particular co-invented
(with C. Gauss) the electromagnetic telegraph, and in 1856 was first, together with R. Kohlrausch, to notice that
the value of (in modern terms) 1/(ε₀μ₀)^(1/2), derived from electrostatic and magnetostatic measurements, coincides
with the independently measured speed of light c, giving an important motivation for Maxwell’s theory.


For the even simpler case of a single current,

$$\Phi = LI, \tag{5.70}$$

so that the magnetic energy of the current may be presented in several equivalent forms:

$$U = \frac{L}{2}I^2 = \frac{1}{2}I\Phi = \frac{1}{2L}\Phi^2. \tag{5.71}$$


These relations, similar to Eqs. (2.14)-(2.15) of electrostatics, show that the self-inductance L of a 
current loop may be considered as a measure of system’s magnetic energy at fixed current. 


Now we are well equipped for the calculation of inductances, having three options. The first one
is to use Eq. (60) directly. 31 The second one is to calculate the magnetic field energy from Eq. (57) as
a function of currents I_k in the system, and then use Eq. (59) to find all coefficients L_kk'. For example,
for a system with just one current, Eq. (71) yields

$$L = \frac{U}{I^2/2}. \tag{5.72}$$


Finally, if the system consists of thin wires, so that the loop areas S_k and hence fluxes Φ_kk' are well
defined, we may calculate them from Eq. (65), and then use Eq. (67) to find the inductances.

Actually, the first two options may have advantages over the third one even for such systems of
thin wires for which the notion of magnetic flux is not quite clear. As an important example, let us find
the inductance of a long solenoid - see Fig. 6a. We have already calculated the magnetic field inside it - see
Eq. (40) - so that, due to the field uniformity, the magnetic flux piercing each wire turn is just

$$\Phi_1 = BA = \mu_0 n I A, \tag{5.73}$$

where A is the area of the solenoid’s cross-section - for example πR² for a round solenoid, though Eq. (40) is
more general. Comparing Eqs. (73) and (67), one might wrongly conclude that L = Φ₁/I = μ₀nA
[WRONG!], i.e. that the solenoid’s inductance is independent of its length. Actually, the magnetic flux
Φ₁ pierces each wire turn, so that the total flux through the whole current loop, consisting of N turns, is

$$\Phi = N\Phi_1 = \mu_0 n^2 l A\,I, \tag{5.74}$$

and the correct expression for the solenoid’s inductance is

$$L = \frac{\Phi}{I} = \mu_0 n^2 l A, \tag{5.75}$$


i.e. the inductance per unit length is constant: L/l = μ₀n²A. Since this reasoning may seem a bit flimsy, it
is prudent to verify it by using Eq. (57) to calculate the full magnetic energy inside the solenoid
(neglecting minor fringe and external field contributions):

$$U = \frac{1}{2\mu_0}B^2 Al = \frac{1}{2\mu_0}(\mu_0 n I)^2 Al = \mu_0 n^2 l A\,\frac{I^2}{2}. \tag{5.76}$$

31 Numerous applications of this Neumann formula to electrical engineering problems may be found, for example,
in the classical text F. Grover, Inductance Calculations, Dover, 1946.


Plugging this result into Eq. (72) immediately confirms result (75). 
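The agreement of the flux-based and energy-based routes to Eq. (5.75) may be also checked numerically. In the sketch below, the function names, the SI value of μ₀, and the solenoid parameters are assumptions made for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # conventional SI value of the vacuum permeability (H/m), assumed here


def solenoid_inductance_from_flux(n, length, area):
    """L = Phi / I, with the flux through all N = n*l turns, Eqs. (5.73)-(5.75)."""
    turns = n * length
    flux_per_turn_per_ampere = MU0 * n * area  # Phi_1 / I from Eq. (5.73)
    return turns * flux_per_turn_per_ampere


def solenoid_inductance_from_energy(n, length, area, current=1.0):
    """L = U / (I^2 / 2), with U from Eq. (5.76)."""
    b = MU0 * n * current  # uniform internal field, Eq. (5.40)
    energy = b**2 * area * length / (2.0 * MU0)
    return energy / (current**2 / 2.0)


if __name__ == "__main__":
    # Hypothetical solenoid: 1000 turns per meter, 0.5 m long, 1 cm^2 cross-section.
    print(solenoid_inductance_from_flux(1000.0, 0.5, 1e-4))
    print(solenoid_inductance_from_energy(1000.0, 0.5, 1e-4, current=3.0))
```

The two results coincide for any value of the current, as they should, since the inductance depends only on the geometry of the system.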

The use of the first two options for inductance calculation becomes inevitable for continuously
distributed currents. As an example, let us calculate self-inductance L of a long coaxial cable with the
cross-section shown in Fig. 7. 32



Fig. 5.7. Cross-section of a coaxial cable. 


Let us assume that the current is uniformly distributed over the cross-sections of both 
conductors. (As we know from the previous chapter, such distribution indeed takes place if both the 
internal and external conductors are made of a uniform resistive material.) First, we should calculate the 
radial distribution of the magnetic field (that of course has only one, azimuthal component, because of 
the axial symmetry of the problem). This distribution may be immediately found from the application of 
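the Ampere law to circles of radii ρ within four different ranges:

$$2\pi\rho B = \mu_0 I\big|_{\text{piercing the circle area}} = \mu_0 I \times \begin{cases} \rho^2/a^2, & \text{for } \rho < a,\\ 1, & \text{for } a < \rho < b,\\ (c^2-\rho^2)/(c^2-b^2), & \text{for } b < \rho < c,\\ 0, & \text{for } c < \rho. \end{cases} \tag{5.77}$$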


Now, an elementary integration yields the magnetic energy per unit length of the cable:

$$\begin{aligned} \frac{U}{l} &= \frac{1}{2\mu_0}\int B^2\,d^2r = \frac{\pi}{\mu_0}\int_0^{\infty} B^2\rho\,d\rho\\ &= \frac{\mu_0 I^2}{4\pi}\left[\int_0^a \left(\frac{\rho}{a^2}\right)^2\rho\,d\rho + \int_a^b \left(\frac{1}{\rho}\right)^2\rho\,d\rho + \int_b^c \left(\frac{c^2-\rho^2}{\rho\,(c^2-b^2)}\right)^2\rho\,d\rho\right]\\ &= \frac{\mu_0 I^2}{4\pi}\left[\ln\frac{b}{a} + \frac{c^2}{c^2-b^2}\left(\frac{c^2}{c^2-b^2}\ln\frac{c}{b} - \frac{1}{2}\right)\right]. \end{aligned} \tag{5.78}$$

From here, and Eq. (72), we get the final answer:

$$\frac{L}{l} = \frac{\mu_0}{2\pi}\left[\ln\frac{b}{a} + \frac{c^2}{c^2-b^2}\left(\frac{c^2}{c^2-b^2}\ln\frac{c}{b} - \frac{1}{2}\right)\right]. \tag{5.79}$$

Note that for the particular case of a thin outer conductor, c - b ≪ b, this expression reduces to

$$\frac{L}{l} \approx \frac{\mu_0}{2\pi}\left(\ln\frac{b}{a} + \frac{1}{4}\right), \tag{5.80}$$

where the first term in the parentheses may be traced back to the contribution of the magnetic field
energy in the free space between the conductors. This distinction is important for some applications,
because in superconductor cables, as well as resistive-metal cables at high frequencies (to be discussed
in the next chapter), the field does not penetrate the conductor bulk, so that Eq. (80) is valid without the
last term, 1/4, in the parentheses, which is due to the magnetic field energy inside the wire.

32 As a reminder, the mutual capacitance C between the conductors of such a system was calculated in Sec. 2.3.
As will be discussed in Chapter 7 below, the pair of parameters L and C define the propagation of the most
important, TEM mode of electromagnetic waves along the cable.
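Since the algebra leading to Eqs. (5.78)-(5.80) is somewhat bulky, a numerical cross-check may be reassuring. The sketch below integrates the energy density of the field (5.77) over the cable's cross-section by the midpoint rule and compares the result with the closed form; the function names, the number of integration steps, and the sample radii are assumptions made for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # conventional SI value of the vacuum permeability (H/m), assumed here


def coax_inductance_per_length(a, b, c):
    """Closed-form L/l of the coaxial cable, Eq. (5.79)."""
    x = c**2 / (c**2 - b**2)
    return MU0 / (2.0 * math.pi) * (math.log(b / a) + x * (x * math.log(c / b) - 0.5))


def coax_inductance_numeric(a, b, c, steps=200000):
    """L/l = 2U/(l I^2), with U/l integrated numerically from the field (5.77)."""

    def enclosed_fraction(rho):
        # Fraction of the current I piercing the circle of radius rho, Eq. (5.77).
        if rho < a:
            return rho**2 / a**2
        if rho < b:
            return 1.0
        if rho < c:
            return (c**2 - rho**2) / (c**2 - b**2)
        return 0.0

    h = c / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        total += enclosed_fraction(rho) ** 2 / rho
    return MU0 / (2.0 * math.pi) * total * h


if __name__ == "__main__":
    a, b, c = 1.0, 3.0, 3.5  # only the ratios of the radii matter
    print(coax_inductance_numeric(a, b, c), coax_inductance_per_length(a, b, c))
```

For an outer conductor with c just slightly larger than b, the closed form approaches the thin-shell limit (5.80).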

As the last example, let us calculate the mutual inductance between a long straight wire and a 
round wire loop adjacent to it (Fig. 8), neglecting the thickness of both wires. 



Fig. 5.8. Study case for the mutual inductance calculation.


Here there is no problem with using the last formalism, based on the magnetic flux calculation.
Indeed, in the Cartesian coordinates shown in Fig. 8, Eq. (20) reads B₁ = μ₀I₁/2πy, giving the following
magnetic flux through the round wire loop:

$$\Phi_{21} = \frac{\mu_0 I_1}{2\pi}\int_{-R}^{+R}dx\int_{R-(R^2-x^2)^{1/2}}^{R+(R^2-x^2)^{1/2}}\frac{dy}{y} = \frac{\mu_0 I_1}{\pi}\int_0^R \ln\frac{R+(R^2-x^2)^{1/2}}{R-(R^2-x^2)^{1/2}}\,dx \equiv \frac{\mu_0 I_1 R}{\pi}\int_0^1 \ln\frac{1+(1-\xi^2)^{1/2}}{1-(1-\xi^2)^{1/2}}\,d\xi. \tag{5.81}$$

This is a table integral equal to π, 33 so that Φ₂₁ = μ₀I₁R, and the final answer for the mutual inductance
M = L₁₂ = L₂₁ = Φ₂₁/I₁ is finite (and very simple):

$$M = \mu_0 R, \tag{5.82}$$

despite the magnetic field's divergence at the lowest point of the loop (y = 0). Note that in contrast with the
finite mutual inductance of this system, self-inductances of both wires are formally infinite in the
thin-wire limit - see, e.g., Eq. (80), that in the limit b/a ≫ 1 describes a thin straight wire. However, since
this divergence is very weak (logarithmic), it is quenched by any deviation from this perfect geometry.
For example, a good estimate of the inductance of a wire of a large but finite length l may be obtained
from Eq. (80) via the replacement of b with l:

$$L \sim \frac{\mu_0}{2\pi}\,l\,\ln\frac{l}{a}. \tag{5.83}$$

(Note, however, that the exact result depends on where from/to the current flows beyond that segment.)
A close estimate, with l replaced with 2πR, and b replaced with R, is valid for the self-inductance of the
round loop. A more exact calculation of this inductance, which would be asymptotically correct in the
limit a ≪ R, is a very useful exercise, which is highly recommended to the reader. 34

33 See, e.g., MA Eq. (6.13) for a = 1.
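The value of the table integral in Eq. (5.81), and hence the result (5.82), may be confirmed by a direct numerical integration. The sketch below uses the midpoint rule, which tolerates the integrable logarithmic singularity at ξ = 0; the function name, the number of steps, and the algebraic rewriting of the logarithm are assumptions made for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # conventional SI value of the vacuum permeability (H/m), assumed here


def mutual_inductance_wire_loop(radius, steps=200000):
    """M = Phi_21 / I_1 from the last form of Eq. (5.81), by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * h
        s = math.sqrt(1.0 - xi * xi)
        # ln[(1 + s)/(1 - s)] rewritten as 2 ln[(1 + s)/xi], using 1 - s^2 = xi^2,
        # to avoid the loss of precision in 1 - s at small xi.
        total += 2.0 * math.log((1.0 + s) / xi)
    return MU0 * radius / math.pi * total * h


if __name__ == "__main__":
    R = 0.05  # hypothetical loop radius, in meters
    print(mutual_inductance_wire_loop(R) / (MU0 * R))
```

The printed ratio M/μ₀R is close to 1, i.e. the dimensionless integral is indeed close to π.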


5.4. Magnetic dipole moment, and magnetic dipole media 

The most natural way of description of magnetic media parallels that described in Chapter 3 for
dielectrics, and is based on properties of magnetic dipoles. To introduce this notion quantitatively, let us
consider, just as in Sec. 3.1, a spatially-localized system with current distribution j(r), whose magnetic
field is measured at relatively large distances r ≫ r’ (Fig. 9).



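Applying the truncated Taylor expansion (3.4) to definition (28) of the vector potential, we get

$$\mathbf{A}(\mathbf{r}) \approx \frac{\mu_0}{4\pi}\left[\frac{1}{r}\int_V \mathbf{j}(\mathbf{r}')\,d^3r' + \frac{1}{r^3}\int_V (\mathbf{r}\cdot\mathbf{r}')\,\mathbf{j}(\mathbf{r}')\,d^3r'\right]. \tag{5.84}$$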


Due to the vector character of this potential, we have to depart slightly from the approach of Sec. 3.1
and use the following vector algebra identity: 35

$$\int_V \left[f\,(\mathbf{j}\cdot\nabla g) + g\,(\mathbf{j}\cdot\nabla f)\right]d^3r = 0, \tag{5.85}$$

that is valid for any pair of smooth (differentiable) scalar functions f(r) and g(r), and any vector function
j(r) that, as the dc current density, satisfies the continuity condition ∇·j = 0 and whose normal
component vanishes on the surface limiting volume V.


First, let us use Eq. (85) with f = 1 and g equal to any component of the radius-vector r: $g = r_i$
(i = 1, 2, 3). Then it yields

$$\int_V (\mathbf{j}\cdot\mathbf{n}_i)\,d^3r = \int_V j_i\,d^3r = 0, \tag{5.86}$$


so that for the vector as the whole 


34 Its solution may be found, for example, just after Sec. 34 of L. Landau et al., Electrodynamics of Continuous
Media, 2nd ed., Butterworth-Heinemann, 1984.

35 See, e.g., MA Eq. (12.3) with the additional condition $j_n|_S = 0$, pertinent for space-restricted currents.
$$\int_V \mathbf{j}(\mathbf{r})\,d^3r = 0, \tag{5.87}$$


showing that the first term in the right-hand part of Eq. (84) equals zero. Next, let us use Eq. (85) with
$f = r_i$, $g = r_{i'}$ (i, i' = 1, 2, 3); then it yields

$$\int_V (r_i j_{i'} + r_{i'} j_i)\,d^3r = 0, \tag{5.88}$$


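so that the i-th Cartesian component of the second integral in Eq. (84) may be transformed as

$$\int_V (\mathbf{r}\cdot\mathbf{r}')\,j_i\,d^3r' = \int_V \sum_{i'=1}^{3} r_{i'} r'_{i'} j_i\,d^3r' = \frac{1}{2}\int_V \sum_{i'=1}^{3} r_{i'}\left(r'_{i'} j_i - r'_i j_{i'}\right)d^3r' = -\frac{1}{2}\left[\mathbf{r}\times\int_V (\mathbf{r}'\times\mathbf{j})\,d^3r'\right]_i. \tag{5.89}$$

As a result, Eq. (84) may be rewritten as

$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\frac{\mathbf{m}\times\mathbf{r}}{r^3}, \tag{5.90}$$

where vector m, defined as 36

$$\mathbf{m} \equiv \frac{1}{2}\int_V \mathbf{r}\times\mathbf{j}(\mathbf{r})\,d^3r, \tag{5.91}$$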


is called the magnetic dipole moment of our system - that itself, within approximation (90), is called the 
magnetic dipole. 

Note a close analogy between m and the angular momentum of a nonrelativistic particle with
mass $m_k$:

$$\mathbf{L}_k \equiv \mathbf{r}_k\times\mathbf{p}_k = \mathbf{r}_k\times m_k\mathbf{v}_k, \tag{5.92}$$

where $\mathbf{p}_k = m_k\mathbf{v}_k$ is its mechanical momentum. Indeed, for a continuum of such particles with the same
electric charge q, with the spatial density n, j = qnv, and Eq. (91) yields

$$\mathbf{m} = \int_V \frac{1}{2}\,\mathbf{r}\times\mathbf{j}\,d^3r = \int_V \frac{nq}{2}\,\mathbf{r}\times\mathbf{v}\,d^3r, \tag{5.93}$$


while the total angular momentum of such a continuous system of particles of the same mass ($m_k = m_0$) is

$$\mathbf{L} = \int_V n m_0\,\mathbf{r}\times\mathbf{v}\,d^3r, \tag{5.94}$$

so that we get a very straightforward relation

$$\mathbf{m} = \frac{q}{2m_0}\,\mathbf{L}. \tag{5.95}$$


36 In the Gaussian units, definition (91) is kept valid, so that Eq. (90) is stripped of the factor $\mu_0/4\pi$.
For the orbital motion, this classical relation survives in quantum mechanics for operators and hence for
eigenvalues, in which the angular momentum is quantized in the units of the Planck constant ħ, so that
for an electron, the orbital magnetic moment is always a multiple of the so-called Bohr magneton

$$\mu_B \equiv \frac{e\hbar}{2m_e}, \tag{5.96}$$


where $m_e$ is the free electron mass. 37 However, for particles with spin, such a universal relation between
vectors m and L is no longer valid. For example, the electron's spin s = 1/2 gives contribution ħ/2 to the
mechanical (angular) momentum, but its contribution to the magnetic moment is still very close to $\mu_B$. 38


The next important example of a magnetic dipole is a planar wire loop limiting area A (of an
arbitrary shape), carrying current I, for which m has a surprisingly simple form,

$$\mathbf{m} = I\mathbf{A}, \tag{5.97}$$

where the modulus of vector A equals area A, and its direction is perpendicular to the loop's plane. This
formula may be readily proved by noticing that if we select the coordinate origin on the plane of the
loop (Fig. 10), then the elementary component of the magnitude of integral (91),

$$m = \frac{1}{2}\left|\oint_C \mathbf{r}\times I\,d\mathbf{r}\right| = I\oint_C \frac{1}{2}\,r^2\,d\varphi, \tag{5.98}$$

is just I times the elementary area $dA = (1/2)\,r\,dh = (1/2)\,r\,(r\,d\varphi) = r^2 d\varphi/2$.
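The relation m = IA is easy to check numerically for a loop made of straight segments. The sketch below is an illustration only: the helper name `loop_moment` and the sample rectangle are assumptions introduced here. It relies on the fact that for a straight segment from $\mathbf{r}_k$ to $\mathbf{r}_{k+1}$, the line integral of $\mathbf{r}\times d\mathbf{r}$ equals $\mathbf{r}_k\times(\mathbf{r}_{k+1}-\mathbf{r}_k)$ exactly.

```python
import numpy as np


def loop_moment(vertices, current):
    """m = (I/2) * closed-contour integral of r x dr for a polygonal wire loop.

    vertices: (N, 3) array of corner positions, listed in the direction of the current.
    """
    r = np.asarray(vertices, dtype=float)
    dr = np.roll(r, -1, axis=0) - r  # segment vectors r_{k+1} - r_k (the loop closes itself)
    # Each straight segment contributes r_k x dr_k to the integral of r x dr.
    return 0.5 * current * np.cross(r, dr).sum(axis=0)


# Hypothetical example: a 2 m x 3 m rectangle in the xy-plane, traversed counterclockwise.
rectangle = np.array([[0.0, 0.0, 0.0],
                      [2.0, 0.0, 0.0],
                      [2.0, 3.0, 0.0],
                      [0.0, 3.0, 0.0]])
m_rect = loop_moment(rectangle, current=0.5)
# Moving the origin (even off the loop's plane) should not change m for a closed loop.
m_shifted = loop_moment(rectangle + np.array([1.0, -2.0, 5.0]), current=0.5)
print(m_rect, m_shifted)
```

For this rectangle, I·A = 0.5 A × 6 m² = 3 A·m², directed along +z, and the shifted-origin result coincides with it because the sum of all segment vectors of a closed loop vanishes.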



The combination of Eqs. (96) and (97) allows a useful estimate of the scale of atomic currents,
by finding what current I should flow in a circular loop of atomic size scale (the Bohr radius)
$r_B \approx 0.5\times10^{-10}$ m, i.e. of area $A \approx 10^{-20}$ m², to produce a magnetic moment equal to $\mu_B$. 39 The result is
surprisingly macroscopic: I ~ 1 mA (quite comparable to the currents driving your earbuds :-). Though
this estimate should not be taken too literally, due to the quantum-mechanical spread of the electron's
wavefunctions, it is very useful for getting a feeling of how significant the atomic magnetism is and hence
why ferromagnets may provide such a strong field.
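The arithmetic behind this estimate may be spelled out in a few lines. This is a rough sketch: the constants are rounded SI values entered by hand (an assumption of this example, not data from the text), and the circular-loop area $\pi r_B^2$ is used for A.

```python
import math

# Rounded SI values of the constants (assumed here for illustration).
E_CHARGE = 1.602e-19   # elementary charge, C
HBAR = 1.055e-34       # reduced Planck constant, J s
M_E = 9.109e-31        # free electron mass, kg
R_BOHR = 0.529e-10     # Bohr radius, m

mu_B = E_CHARGE * HBAR / (2 * M_E)   # Bohr magneton, Eq. (5.96), in J/T
area = math.pi * R_BOHR**2           # area of a circular loop of the Bohr radius, m^2
current = mu_B / area                # current giving m = mu_B, from m = I A, Eq. (5.97)

# Alternative estimate I ~ e f = e omega / 2 pi, with omega ~ 1e16 1/s (see footnote 39).
current_alt = E_CHARGE * 1e16 / (2 * math.pi)

print(f"mu_B = {mu_B:.3e} J/T, I = {current * 1e3:.2f} mA, I_alt = {current_alt * 1e3:.2f} mA")
```

Both numbers land in the sub-milliampere to milliampere range, which is all that such an order-of-magnitude argument can claim.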


37 In SI units, $m_e \approx 0.91\times10^{-30}$ kg, so that $\mu_B \approx 0.93\times10^{-23}$ J/T.

38 See, e.g., QM Sec. 4.1 and beyond.

39 Another way to arrive at the same estimate is to take $I \sim ef = e\omega/2\pi$ with $\omega \sim 10^{16}\ \text{s}^{-1}$ being the typical
frequency of radiation due to atomic interlevel quantum transitions.
After these illustrations, let us return to Eq. (90). Plugging it into the general formula (27), we
may calculate the magnetic field of a magnetic dipole:

Magnetic dipole’s field

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{m}) - \mathbf{m}r^2}{r^5}. \tag{5.99}$$

The structure of this formula exactly duplicates that of Eq. (3.15) for the electric dipole field. Because of
this similarity, the energy of a dipole in an external field, and hence the torque and force exerted on it by
the field, are also absolutely similar to the expressions for an electric dipole - see Eqs. (3.15)-(3.18):

Magnetic dipole in external field

$$U = -\mathbf{m}\cdot\mathbf{B}_{\text{ext}}, \tag{5.100}$$

and as a result,

$$\boldsymbol{\tau} = \mathbf{m}\times\mathbf{B}_{\text{ext}}, \tag{5.101}$$

$$\mathbf{F} = \nabla(\mathbf{m}\cdot\mathbf{B}_{\text{ext}}). \tag{5.102}$$
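The dipole field, energy, torque, and force formulas translate directly into code. The sketch below is only an illustration under stated assumptions: the function names, the finite-difference step, and the two-dipole test geometry (two parallel unit dipoles on a common axis, 2 m apart) are hypothetical choices made here, and the gradient in the force is taken numerically rather than analytically.

```python
import numpy as np

MU_0 = 4e-7 * np.pi  # vacuum permeability, H/m (classical SI value)


def dipole_field(m, r):
    """B(r) = (mu_0/4pi) [3 r (r.m) - m r^2] / r^5 for a dipole m at the origin; r must be nonzero."""
    m = np.asarray(m, dtype=float)
    r = np.asarray(r, dtype=float)
    r2 = r @ r
    return MU_0 / (4 * np.pi) * (3 * r * (r @ m) - m * r2) / r2**2.5


def energy(m, b_ext):
    """U = -m . B_ext"""
    return -np.dot(m, b_ext)


def torque(m, b_ext):
    """tau = m x B_ext"""
    return np.cross(m, b_ext)


def force(m, field, r, h=1e-4):
    """F = grad(m . B_ext), evaluated by central differences with step h (in meters)."""
    m = np.asarray(m, dtype=float)
    r = np.asarray(r, dtype=float)
    f = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        f[i] = (m @ field(r + step) - m @ field(r - step)) / (2 * h)
    return f


m1 = np.array([0.0, 0.0, 1.0])   # source dipole at the origin, A m^2
m2 = np.array([0.0, 0.0, 1.0])   # probe dipole, parallel to m1
pos = np.array([0.0, 0.0, 2.0])  # probe position on the axis of m1, m

b_axis = dipole_field(m1, pos)
f_axis = force(m2, lambda r: dipole_field(m1, r), pos)
print(b_axis, energy(m2, b_axis), torque(m2, b_axis), f_axis)
```

In this geometry the probe dipole is aligned with the local field, so the torque vanishes, the energy is negative, and the force points toward the source dipole, in line with the attraction of coaxial currents of the same direction.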


Now let us consider a system of many magnetic dipoles (e.g., atoms or molecules), distributed in 
space with density n. Then we can use Eq. (90) (generalized in the evident way for an arbitrary position, 
r ’, of a dipole), and the linear superposition principle, to calculate the “macroscopic” component of the 
vector-potential A - in other words, dipole's potential averaged over short-scale variations on the inter- 
dipole distances: 



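$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int \frac{\mathbf{M}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}\,d^3r', \tag{5.103}$$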


where M = nm is the macroscopic (average) magnetization, i.e. the magnetic moment per unit volume. 
Transforming this integral absolutely similarly to how Eq. (3.27) had been transformed into Eq. (3.29), 
we get: 



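$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int \frac{\nabla'\times\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.104}$$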


Comparing this result with Eq. (28), we see that ∇×M is equivalent, in its effect, to the density
$\mathbf{j}_{\text{ef}}$ of a certain effective “magnetization current”. Just as the electric-polarization “charge” $\rho_{\text{ef}}$ discussed
in Sec. 3.2 (see Fig. 3.3), $\mathbf{j}_{\text{ef}} = \nabla\times\mathbf{M}$ may be interpreted as the uncompensated part of vortex currents
representing single magnetic dipoles (Fig. 11).



Fig. 5.11. Cartoon illustrating the physical nature of
the “magnetization current” $\mathbf{j}_{\text{ef}} = \nabla\times\mathbf{M}$.
Now, using Eq. (28) to add the possible contribution from “stand-alone” currents j, not included
into the currents of microscopic dipoles, we get the general equation for the vector-potential of the
macroscopic field:

$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int \frac{\mathbf{j}(\mathbf{r}') + \nabla'\times\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d^3r'. \tag{5.105}$$

Repeating the calculations that have led us from Eq. (28) to the Maxwell equation (35), with the account
of the magnetization current term, for the macroscopic magnetic field B we get 40

$$\nabla\times\mathbf{B} = \mu_0(\mathbf{j} + \nabla\times\mathbf{M}). \tag{5.106}$$


Following the same philosophy as in Sec. 3.2, we may recast this equation as 

$$\nabla\times\mathbf{H} = \mathbf{j}, \tag{5.107}$$

where a new field defined as

Magnetic field H

$$\mathbf{H} \equiv \frac{\mathbf{B}}{\mu_0} - \mathbf{M}, \tag{5.108}$$

for historical reasons (and very unfortunately) is also called the magnetic field. 41 It is crucial to remember
that the physical sense of field H is very much different from that of field B. In order to understand the
difference better, let us use Eq. (107) to complete a macroscopic analog of system (36), called the
macroscopic Maxwell equations (again, so far for the stationary case ∂/∂t = 0):


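Stationary macroscopic Maxwell equations

$$\nabla\times\mathbf{E} = 0, \qquad \nabla\times\mathbf{H} = \mathbf{j}, \qquad \nabla\cdot\mathbf{D} = \rho, \qquad \nabla\cdot\mathbf{B} = 0. \tag{5.109}$$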


One can clearly see that the roles of vector fields D and H are very similar: they could be called
“would-be” fields - which would be induced by stand-alone charges and currents, if the medium had not
modified them by its dielectric and/or magnetic polarization.


40 Similarly to the situation with the electric dipoles (see Eq. (3.24) and its discussion), it may be shown that the
magnetic field of any closed current loop (or any system of such loops) satisfies the following equality:

$$\int_{r<R} \mathbf{B}(\mathbf{r})\,d^3r = \frac{2}{3}\,\mu_0\mathbf{m},$$

where the integral is over any sphere confining all the currents. On the other hand, for field (99), derived from the
asymptotic approximation (90), such integral vanishes. In order to get a coarse-grain description of the magnetic
field of a small system located at r = 0, which would be valid everywhere (though at r ~ a, only approximately),
Eq. (99) should be modified as follows:

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\left[\frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{m}) - \mathbf{m}r^2}{r^5} + \frac{8\pi}{3}\,\mathbf{m}\,\delta(\mathbf{r})\right].$$

Hence, strictly speaking, the macroscopic field B participating in Eq. (106) and beyond is the average long-range
field of the magnetic dipoles (plus that of the stand-alone currents j) rather than the genuine average magnetic field.

41 This confusion is exacerbated by the fact that in Gaussian units, Eq. (108) has the form H = B − 4πM, and
hence fields B and H have the same dimensionality (and are equal in free space!) - though the unit of H has a
different name (oersted, abbreviated as Oe). Mercifully, in the SI units, the dimensionality of B and H is different,
with the unit of H being called ampere per meter.
Despite this similarity, let me note an important difference of signs in the relation (3.33) between
E, D, and P, on one hand, and relation (108) between B, H, and M, on the other hand. It is not just a
matter of definition. Indeed, due to the similarity of Eqs. (3.15) and (100), including similar signs, the
electric and magnetic fields both try to orient the corresponding dipole moments along the field. Hence,
in the media that allow such orientation (and as we will see momentarily, for magnetic media it is not
always the case), the induced polarizations P and M are directed along, respectively, vectors E and B.
According to Eq. (3.33), if the would-be field D is fixed - say, by a fixed stand-alone charge
distribution ρ(r) - such polarization reduces the genuine average electric field $\mathbf{E} = (\mathbf{D} - \mathbf{P})/\varepsilon_0$. On the
other hand, Eq. (108) shows that in a magnetic medium with a fixed would-be field H, magnetic
polarization with M ↑↑ B enhances the average magnetic field $\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$. This difference may be
traced back to the sign difference in the initial relations (1.1) and (5.1), i.e. to the basic fact that charges
of the same sign repel, while currents of the same direction attract each other.

In order to form a complete system of differential equations, the macroscopic Maxwell equations
(109) have to be complemented with “material relations” D ↔ E, j ↔ E, and B ↔ H. In the previous two
chapters we already discussed, in brief, two of them; let us proceed to the last one.


5.5. Magnetic materials 


A major difference between the dielectric and magnetic material equations D(E) and B(H) is that
while a typical dielectric medium reduces the external electric field, a magnetic medium may either reduce or
enhance the magnetic field. In order to quantify this fact, let us consider the so-called linear magnetics in
which M (and hence H) are proportional to B. Just as in dielectrics, in a material without spontaneous
magnetization, such linearity at relatively low fields follows from the Taylor expansion of function M(B).
For isotropic materials, this proportionality is characterized by a scalar - either the magnetic permeability μ,
defined by the following relation:

Magnetic permeability

$$\mathbf{B} = \mu\mathbf{H}, \tag{5.110}$$


or the magnetic susceptibility 42 defined as

Magnetic susceptibility

$$\mathbf{M} = \chi_m\mathbf{H}. \tag{5.111}$$

Plugging these relations into Eq. (108), we see that these two parameters are not independent, but are
related as

χ_m vs. μ

$$\mu = (1 + \chi_m)\mu_0. \tag{5.112}$$


Note that despite the superficial similarity between Eqs. (110)-(111) and relations (3.35)-(3.38)
for linear dielectrics:


42 According to Eq. (110) (i.e. in SI units), $\chi_m$ is dimensionless, while μ has the same dimensionality as
$\mu_0$. In the Gaussian units, μ is dimensionless, $(\mu)_{\text{Gaussian}} = (\mu)_{\text{SI}}/\mu_0$, and $\chi_m$ is also introduced differently, as
$\mu = 1 + 4\pi\chi_m$. Hence, just as for the electric susceptibilities, these dimensionless coefficients are different in
the two systems: $(\chi_m)_{\text{SI}} = 4\pi(\chi_m)_{\text{Gaussian}}$. Note also that $\chi_m$ is formally called the volume magnetic
susceptibility, in order to distinguish it from the molecular susceptibility χ defined by a similar relation, m = χH,
where m is the average induced magnetic moment of a single dipole - e.g., a molecule. Evidently, in a dilute
medium, i.e. in the absence of substantial dipole-dipole interaction, $\chi_m = n\chi$, where n is the dipole density.
$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{P} = \chi_e\varepsilon_0\mathbf{E}, \qquad \varepsilon = (1 + \chi_e)\varepsilon_0, \tag{5.113}$$

there is an important conceptual difference between them. Namely, while vector E in the right-hand
parts of Eqs. (113) is the real (average) electric field, vector H in the right-hand part of Eqs. (110)-(111)
represents a “would-be” magnetic field, in all aspects similar to vector D rather than E. For relatively
dense media, whose polarization may affect the genuine fields substantially, this difference between
parameters ε and μ may make their properties (e.g., the Kramers-Kronig relations, to be discussed in
Sec. 7.3) rather different.

Another difference between parameters ε and μ (and hence between $\chi_e$ and $\chi_m$) is evident from
Table 1, which lists the values of magnetic susceptibility for several materials. It shows that in contrast
to linear dielectrics whose susceptibility $\chi_e$ is always positive, i.e. the dielectric constant $\varepsilon_r = \chi_e + 1$ is
always larger than 1 (see Table 3.1), linear magnetics may be either paramagnets ($\chi_m > 0$, i.e. $\mu > \mu_0$) or
diamagnets ($\chi_m < 0$, i.e. $\mu < \mu_0$).


Table 5.1. Magnetic susceptibility $(\chi_m)_{\text{SI}}$ of a few representative (and/or important) materials (a)

Material                                                    $(\chi_m)_{\text{SI}}$
“Mu-metal” (75% Ni + 15% Fe + a few %% of Cu and Mo)        ~20,000 (b)
Permalloy (80% Ni + 20% Fe)                                 ~8,000 (b)
“Soft” (or “transformer”) iron                              ~4,000 (b)
Nickel                                                      ~100
Aluminum                                                    $+2\times10^{-5}$
Diamond                                                     $-2\times10^{-5}$
Copper                                                      $-7\times10^{-5}$
Water                                                       $-9\times10^{-6}$
Bismuth (the strongest non-superconducting diamagnet)       $-1.7\times10^{-4}$


(a) The table does not include bulk superconductors, which in a crude (“macroscopic”)
approximation may be described as perfect diamagnets (with B = 0, i.e. $\chi_m = -1$ and μ = 0), though the
actual physics of this phenomenon is more complex - see Sec. 6.3 below.

(b) The exact values of $\chi_m$ for soft ferromagnetic materials depend not only on their exact
composition, but also on their thermal processing (“annealing”). Moreover, due to unintentional
vibrations, the extremely high $\chi_m$ of such materials may somewhat decay with time, though may be
restored to approach the original value by new annealing.
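The relation between susceptibility and permeability, and the sign-based classification of linear magnetics, may be illustrated with a few entries of Table 5.1. This is a minimal sketch: the dictionary holds only the weakly magnetic rows copied from the table, and the helper names are assumptions introduced here for illustration.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability, H/m (classical SI value)

# (chi_m)_SI of the weakly magnetic materials, as listed in Table 5.1.
CHI_M = {
    "aluminum": 2e-5,
    "diamond": -2e-5,
    "copper": -7e-5,
    "water": -9e-6,
    "bismuth": -1.7e-4,
}


def permeability(chi_m):
    """mu = (1 + chi_m) * mu_0, Eq. (5.112)."""
    return (1.0 + chi_m) * MU_0


def classify(chi_m):
    """Paramagnet if chi_m > 0 (mu > mu_0), diamagnet if chi_m < 0 (mu < mu_0)."""
    if chi_m > 0:
        return "paramagnet"
    if chi_m < 0:
        return "diamagnet"
    return "non-magnetic"


def gaussian_susceptibility(chi_m_si):
    """(chi_m)_Gaussian = (chi_m)_SI / (4 pi), see footnote 42."""
    return chi_m_si / (4 * math.pi)


for name, chi in CHI_M.items():
    print(f"{name:10s} {classify(chi):10s} mu/mu_0 - 1 = {permeability(chi) / MU_0 - 1:+.1e}")
```

In the crude macroscopic description of note (a), a bulk superconductor corresponds to `permeability(-1.0)`, which gives zero.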


The reason for this difference is that in dielectrics, two different polarization mechanisms
(schematically illustrated by Fig. 12) lead to the same sign of the average polarization. The first of them
takes place in atoms without their own spontaneous polarization. A crude classical image of such an
atom is an isotropic cloud of negatively charged electrons surrounding a positively charged nucleus - see
Fig. 12a. The external electric field shifts the positive charge in the direction of E, and negative charges
in the opposite direction, thus creating a dipole with aligned vectors p and E, and hence positive
polarizability $\alpha_{\text{mol}}$ - see Eq. (3.39). As a result, the electric susceptibility is also positive - see Eqs.
(3.41) or (3.71).

In the second case (Fig. 12b) of a gas or liquid consisting of polar molecules, each molecule has
its own, spontaneous dipole moment $\mathbf{p}_0$ even in the absence of an external electric field. (A typical example
is a water molecule H2O, with the negative oxygen ion positioned out of the line connecting two positive
hydrogen ions, thus producing a spontaneous dipole with the moment's magnitude $p_0 \sim e\times0.38\times10^{-10}$ m.)
However, in the absence of the applied electric field, the orientation of such dipoles is random, so that
the average polarization $\mathbf{P} = n\langle\mathbf{p}_0\rangle$ equals zero. A weak applied field does not change the magnitude of
the dipole moments significantly, but creates their preferential orientation along the field (in order to
decrease the potential energy $U = -\mathbf{p}_0\cdot\mathbf{E}$), thus creating a nonvanishing vector average $\langle\mathbf{p}_0\rangle$ directed
along E. If the applied field is not too high ($p_0 E \ll k_B T$), the induced polarization $\mathbf{P} = n\langle\mathbf{p}_0\rangle$ is
proportional to E, again giving a positive polarizability $\alpha_{\text{mol}}$. 43


Fig. 5.12. Cartoons of two types of induced electrical polarization: (a) elementary dipole induction
and (b) partial ordering of spontaneous elementary dipoles.


Returning to magnetics, the second of the above mechanisms, i.e. the ordering of spontaneous
dipoles by the applied field, is responsible for paramagnetism. Again, now according to Eq. (100),
such a field tends to align the dipoles along its direction, so that the average direction of spontaneous
elementary moments $\mathbf{m}_0$, and hence the direction of M, is the same as that of the average field B (i.e.,
for a dilute medium, of $\mathbf{H} \approx \mathbf{B}/\mu_0$), resulting in a positive susceptibility $\chi_m$. However, in contrast to the
electric polarization, there is a mechanism of magnetic polarization, called the orbital (or “Larmor” 44)
diamagnetism, which gives $\chi_m < 0$. As its simplest model, let us consider the orbital motion of an
atomic electron as a classical particle of mass $m_0$, with electric charge q, about an immobile attractive
center - modeling the atomic nucleus. As classical mechanics tells us, the central attractive force does


43 The proportionality of $|\langle\mathbf{p}_0\rangle|$ (and hence P) to E is a result of a dynamic balance between the dipole-orienting
torque (101) and disordering thermal fluctuations. A qualitative description of such balances is one of the main
tasks of statistical mechanics - see, e.g., SM Chapters 2 and 4. However, the very fact of proportionality P ∝ E in
low fields may be readily understood as the result of the Taylor expansion of function P(E) at E → 0.

44 After J. Larmor (1857 - 1947) who first described the torque-induced precession mathematically.
not change the particle's angular momentum $\mathbf{L} \equiv m_0\mathbf{r}\times\mathbf{v}$, but the applied magnetic field B (that may be
taken uniform on the atomic scale) does, due to the torque (101) it applies to magnetic moment (95):

$$\frac{d\mathbf{L}}{dt} = \boldsymbol{\tau} = \mathbf{m}\times\mathbf{B} = \frac{q}{2m_0}\,\mathbf{L}\times\mathbf{B}. \tag{5.114}$$

The vector diagram in Fig. 13 shows that in the limit of a relatively weak field, when the
magnitude of the angular momentum L may be considered constant, this equation describes the rotation
(called the torque-induced precession 45) of vector L about the direction of vector B, with angular
frequency $\Omega = -qB/2m_0$, independent of angle θ. Let me leave it to the reader to use Eq. (114) for
checking that, irrespective of the sign of charge q, the resulting additional magnetic moment Δm has a
direction opposite to that of vector B, and hence $\chi_m$ is negative, leading to the Larmor diamagnetism. 46
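The precession described by Eq. (5.114) can be checked by direct numerical integration. The following is a minimal sketch, not a production integrator: the function name, the fixed-step fourth-order Runge-Kutta scheme, and the dimensionless parameters (q = −1, m_0 = 1, B = 2 along z, so that Ω = −qB/2m_0 = +1) are all assumptions chosen here for illustration.

```python
import numpy as np


def precess(L0, B, q, m0, t_end, steps=2000):
    """Integrate dL/dt = (q / 2 m0) L x B with the classical fixed-step RK4 scheme."""
    L = np.asarray(L0, dtype=float)
    B = np.asarray(B, dtype=float)
    k = q / (2.0 * m0)

    def rate(v):
        return k * np.cross(v, B)

    dt = t_end / steps
    for _ in range(steps):
        k1 = rate(L)
        k2 = rate(L + 0.5 * dt * k1)
        k3 = rate(L + 0.5 * dt * k2)
        k4 = rate(L + dt * k3)
        L = L + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return L


# Dimensionless illustration (hypothetical units).
q, m0 = -1.0, 1.0
B = np.array([0.0, 0.0, 2.0])
omega = -q * B[2] / (2.0 * m0)          # expected precession frequency about the z-axis
L_start = np.array([1.0, 0.0, 1.0])     # tilted by 45 degrees from B
L_quarter = precess(L_start, B, q, m0, t_end=0.5 * np.pi / omega)
print(omega, L_quarter, np.linalg.norm(L_quarter))
```

After a quarter of the period 2π/Ω, the component of L normal to B has turned by 90° about the field direction while the component along B and the magnitude of L stay unchanged, as expected for a pure precession.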



Fig. 5.13. Torque-induced precession of 
a charged particle in a magnetic field. 


An important conceptual question is what exactly prevents the initial magnetic moment m that,
according to Eq. (95), is associated with the angular momentum L of the electron, from turning along
the magnetic field, just as in the second polarization mechanism illustrated by Fig. 12b - thus decreasing
the potential energy (100) of the system. The answer is the same as for the usual mechanical top - it
“wants” to fall due to the gravity field, but cannot do that due to the mechanical inertia. In classical
physics, even a small friction (dissipation) eventually drains the top's rotational kinetic energy, and it falls.
However, in quantum mechanics the ground-state “motion” of electrons in an atom is not subject to
friction, because they cannot be brought to full rest due to Heisenberg's uncertainty principle. Somewhat
counter-intuitively, the magnetic moments due to such a fully quantum effect as spin are much more
susceptible to interaction with the environment, so that in atoms with uncompensated spins, the magnetic
dipole orientation mechanism prevails over the orbital diamagnetism, and the materials incorporating
such atoms usually exhibit net paramagnetism - see Table 1.

Due to possible strong interactions between elementary dipoles, magnetism of materials is an
extremely rich field of physics, with numerous interesting phenomena and elaborate theories.


45 For a detailed discussion of the effect see, e.g., CM Sec. 6.5.

46 The quantum-mechanical treatment (see, e.g., QM Sec. 6.4) confirms this qualitative picture, while giving
quantitative corrections to the classical result for $\chi_m$.
Unfortunately, all this physics is well outside the framework of this course, and I have to refer the
interested reader to special literature, 47 but still need to mention its key notions.

Most importantly, a sufficiently strong dipole-dipole interaction may lead to their spontaneous
ordering, even in the absence of the applied field. This ordering may correspond to either parallel
alignment of the atomic dipoles (ferromagnetism) or anti-parallel alignment of the adjacent dipoles
(antiferromagnetism). Evidently, the external effects of ferromagnetism are stronger, because such a
phase corresponds to a substantial spontaneous magnetization M. (This value is frequently called the
saturation magnetization, $M_S$, while the corresponding magnitude of $B = \mu_0 M$ is called either the
saturation magnetic field, or the remanence field, $B_R$.) The direction of $B_R$ may be switched by the
application of an external magnetic field, with a magnitude above a certain value $H_C$ called coercivity, 48
leading to the well-known hysteretic loops on the [B, H] plane - see Fig. 14 for a typical example.


Fig. 5.14. Experimental magnetization curves of specially processed (cold-rolled) transformer steel,
i.e. a solid solution of ~10% C and ~6% Si in Fe. (Adapted from www.thefullwiki.org/Hysteresis.)

In relatively low fields, $H \ll H_C$, such materials may be described as hard (or “permanent”)
ferromagnets; at such approximate treatment, magnetization M is considered constant. On the other
hand, the theory needed for a fair description of phenomena at $H \sim H_C$ is rather complicated. Indeed, the
direction of magnetization of crystals may be affected by the anisotropy of the crystal lattice. Because
of that, typical non-crystalline ferromagnetic materials (like steel, permalloy, “mu-metal”, etc.) consist
of randomly oriented magnetic domains, each with a certain spontaneous magnetization direction. The
magnetic interaction of the domain with its neighbors and the external field determines the evolution of
its magnetization and hence the average magnetic properties of the ferromagnet. In particular, such
interaction explains why the hysteresis loop shape is dependent on the cycled field amplitude and
cycling history - see Fig. 14. A very important class of multi-domain materials is the so-called soft
ferromagnets, whose coercivity is relatively low. At low cycled field amplitude, the soft ferromagnets
behave, on the average, as linear magnetics with very high values of $\chi_m$ and hence μ (see the top rows of
Table 1, and Fig. 14) that are highly dependent on the material’s fabrication technology and its
post-fabrication thermal and mechanical treatments.



47 See, e.g., D. J. Jiles, Introduction to Magnetism and Magnetic Materials, 2nd ed., CRC Press, 1998, or R. C.
O’Handley, Modern Magnetic Materials, Wiley, 1999.

48 Materials with very high coercivity $H_C$ are frequently called hard ferromagnets or permanent magnets.
High values of $\chi_m$ are also pertinent to magnetics in which the molecular dipole interaction is
relatively weak, so that their ferromagnetic ordering may be destroyed by thermal fluctuations, if
temperature is increased above the so-called Curie temperature $T_C$. At $T > T_C$, such materials behave as
paramagnets, with susceptibility obeying the Curie-Weiss law

$$\chi_m \propto \frac{1}{T - T_C}. \tag{5.115}$$

(At vanishing moment interaction, $T_C \to 0$, and Eq. (115) is reduced to the Curie law $\chi_m \propto 1/T$ typical
for weak paramagnets.) The transition between the ferromagnetic and paramagnetic phases at $T = T_C$ is
the classical example of continuous phase transitions, similar to that between the paraelectric and
ferroelectric phases of a dielectric. In both cases, the “macroscopic” (average) polarization - either M or
P - plays the role of the so-called order parameter that (in the absence of external fields) appears at
$T = T_C$ and increases gradually at the further reduction of temperature. 49

Before returning to magnetostatics per se, I have to mention the large practical role played by 
hard ferromagnetic materials (well beyond refrigerator magnets :-). Indeed, despite the decades of the 
exponential (Moore's-law) progress of semiconductor electronics, most computer data storage systems
are still based on the hard disk drives whose active medium is a submicron-thin ferromagnetic layer, 
with bits stored in the form of the direction of the spontaneous magnetization of small film spots. This 
technology has reached a fantastic sophistication, 50 with recording data density approaching $10^{12}$ bits
per square inch. Only recently it has started to be seriously challenged by the so-called solid state drives 
based on the flash semiconductor memories already mentioned in Chapter 3. 


5.6. Systems with magnetics 

Similarly to the electrostatics of linear dielectrics, magnetostatics of linear magnetics is very 
simple in the particular case when the stand-alone currents are deeply embedded into a medium with a 
constant permeability μ. Indeed, in this case, boundary conditions on the distant surface of the medium do
not affect the solution of the boundary problem described by the magnetic equations of the macroscopic 
Maxwell system (109). Now let us assume that we know the solution B 0 (r) of the magnetic pair of the 
genuine (“microscopic”) Maxwell equations (36) in free space, i.e. when the genuine current density j 
coincides with that of stand-alone currents. Then the macroscopic equations and the material equation 
(110) are completely satisfied with the pair of functions 

$$\mathbf{H}(\mathbf{r}) = \frac{\mathbf{B}_0(\mathbf{r})}{\mu_0}, \qquad \mathbf{B}(\mathbf{r}) = \mu\mathbf{H}(\mathbf{r}) = \frac{\mu}{\mu_0}\,\mathbf{B}_0(\mathbf{r}). \tag{5.116}$$

Hence the only effect of a complete filling of a system of fixed currents with a uniform, linear
magnetic is the multiplication of the magnetic field B at all points by the same constant factor
$\mu/\mu_0 = 1 + \chi_m$. (As a reminder, a similar filling of a system of fixed charges with a uniform, linear
dielectric leads to a reduction of the electric field E by factor $\varepsilon/\varepsilon_0 \equiv \varepsilon_r = 1 + \chi_e$.)

49 A discussion of such transitions may be found, in particular, in SM Chapter 4. 

50 “A magnetic head slider [the read/write head - KKL] flying over a [rather uneven - KKL] disk surface with a
flying height of 25 nm with a relative speed of 20 meters/second is equivalent to an aircraft flying at a physical
spacing of 0.2 μm at 900 kilometers/hour.” B. Bhushan, as quoted in a (generally good) book by G. Hadjipanayis,
Magnetic Storage Systems Beyond 2000, Springer, 2001.
However, this simple result is generally invalid in the case of non-uniform (or piece-wise
uniform) magnetic samples. Theoretical analyses of magnetic field distribution in such non-uniform
systems may be facilitated by two additional tools. First, integrating the macroscopic Maxwell equation
(107) over a smooth surface S limited by a closed contour C, and using the Stokes theorem, we get the
macroscopic version of the Ampere law (37):

Macroscopic Ampere law

$$\oint_C \mathbf{H}\cdot d\mathbf{r} = I. \tag{5.117}$$

This is exactly the replica of the “microscopic” Eq. (37), with the replacement $\mathbf{B}/\mu_0 \to \mathbf{H}$.

Let us apply this relation to a boundary between two regions with constant, but different μ, with
no stand-alone currents on the border, similarly to how this was done for field E in Sec. 3.4 - see Fig. 3.5.
The result is similar as well:

$$H_\tau = \text{const}. \tag{5.118}$$

On the other hand, the integration of the Maxwell equation (29) over a Gaussian pillbox enclosing a
border fragment (again similar to that shown in Fig. 3.5) yields the result similar to Eq. (3.46):

$$B_n = \text{const}, \quad \text{i.e.} \quad \mu H_n = \text{const}. \tag{5.119}$$

Let us use these boundary conditions, first, to see what happens with a thin sheet of magnetic
material (or any other strongly elongated sample) placed parallel to a uniform external field $\mathbf{H}_0$. Such a
sample cannot noticeably disturb the field in the free space outside it: $\mathbf{H}_{\text{ext}} = \mathbf{H}_0$, $\mathbf{B}_{\text{ext}} = \mu_0\mathbf{H}_{\text{ext}} = \mu_0\mathbf{H}_0$.
Now applying Eq. (118) to the dominating, large-area interfaces, we get $\mathbf{H}_{\text{int}} = \mathbf{H}_0$, i.e., $\mathbf{B}_{\text{int}} = (\mu/\mu_0)\mathbf{B}_0$. 51
The fact of constancy of field H in this geometry explains why this field is used as the horizontal axis in
plots like Fig. 14: such measurements are typically carried out by placing an elongated sample of the
material into the uniform field - say the one produced by a long solenoid.

Samples of other geometries may create strong perturbations of the external field, extended to
distances of the order of the transversal dimensions of the sample. In order to analyze such problems, we
may benefit from a simple, partial differential equation for a scalar function, e.g., the Laplace equation,
because in Chapter 2 we have learned how to solve it for many simple geometries. In magnetostatics, the
introduction of a scalar potential is generally impossible due to the vortex-like magnetic field lines, but
if there are no stand-alone currents within the region we are interested in, then the Maxwell equation
(32) for field H is reduced to ∇×H = 0, and we may introduce the scalar potential of the magnetic field,
$\phi_m$, using the relation similar to Eq. (1.33):

$$\mathbf{H} = -\nabla\phi_m. \tag{5.120}$$

Combining it with the homogeneous Maxwell equation for magnetic field, ∇·B = 0, we arrive at the
familiar differential equation,

$$\nabla\cdot(\mu\nabla\phi_m) = 0, \tag{5.121}$$

that, for a uniform medium (μ = const), is reduced to our beloved Laplace equation. Moreover, Eqs. (118)
and (119) give the very familiar boundary conditions: first


51 The reader is highly encouraged to carry out a similar analysis of fields inside narrow gaps cut in a linear 
magnetic, similar to that carried out for linear dielectrics in Sec. 3.3 - see Fig. 3.6 and its discussion. 
$$\frac{\partial\phi_m}{\partial\tau} = \text{const}, \tag{5.122a}$$

which is equivalent to

$$\phi_m = \text{const}, \tag{5.122b}$$

and also

$$\mu\,\frac{\partial\phi_m}{\partial n} = \text{const}. \tag{5.123}$$

Note that these boundary conditions are similar to Eqs. (3.46) and (3.47) of electrostatics, with the
replacement ε → μ. 52


Let us analyze the geometric effects on magnetization, using the (too?) familiar structure: a
sphere, made of a linear magnetic material, in a uniform external field. Since the differential equation
and boundary conditions are similar to those of the similar electrostatics problem (see Fig. 3.8), we can
use the above analogy to recycle the solution we already have got - see Eqs. (3.55)-(3.56). Just as in the
electric case, the field outside the sphere, with potential

$$(\phi_m)_{r>R} = H_0\left[-r + \frac{\mu-\mu_0}{\mu+2\mu_0}\,\frac{R^3}{r^2}\right]\cos\theta, \tag{5.125a}$$

is a sum of the uniform external field $\mathbf{H}_0$ and the dipole field (99) with the following induced magnetic
dipole moment of the sphere: 53

$$\mathbf{m} = 4\pi\,\frac{\mu-\mu_0}{\mu+2\mu_0}\,R^3\,\mathbf{H}_0. \tag{5.125b}$$

On the contrary, the internal field is perfectly uniform:

$$(\phi_m)_{r<R} = -H_0\,\frac{3\mu_0}{\mu+2\mu_0}\,r\cos\theta, \qquad \frac{H_{\text{int}}}{H_0} = \frac{3\mu_0}{\mu+2\mu_0}, \qquad \frac{B_{\text{int}}}{B_0} = \frac{\mu H_{\text{int}}}{\mu_0 H_0} = \frac{3\mu}{\mu+2\mu_0}. \tag{5.126}$$


Note that H inside the sphere is not equal to the applied external field $H_0$. This
example shows that the interpretation of H as the “would-be” magnetic field generated by external


52 This similarity may seem strange, because earlier we have seen that parameter $\mu$ is physically more similar to
$1/\varepsilon$. The reason for this paradox is that in magnetostatics, the introduced potential $\phi_m$ is traditionally used to
describe the “would-be field” H, while in electrostatics, potential $\phi$ describes the real (average) electric field E.
(This tradition persists from the old days when H was perceived as a genuine magnetic field.)

53 Instead of differentiating the $\phi_m$ given by Eq. (125a), we may use the absolute similarity of Eqs. (3.13) and
(99), to derive from Eq. (3.17) a similar expression for the magnetic potential of an arbitrary magnetic dipole:

$$\phi_m = \frac{1}{4\pi} \frac{m \cos\theta}{r^2} .$$

Now comparing this formula with the second term of Eq. (125a), we immediately get Eq. (125b).
currents j should not be exaggerated into saying that its distribution is independent of the magnetic
bodies in the system. 54


In the limit $\mu \gg \mu_0$, Eqs. (126) yield $H_{int}/H_0 \ll 1$, $B_{int}/H_0 \approx 3\mu_0$, the factor 3 being specific for
the particular geometry of the sphere. If a sample is stretched along the applied field, this limitation of
the field concentration is gradually removed, and $B_{int}$ tends to its maximum value $\mu H_0 \gg B_{ext}$, as was
discussed above. This effect of “magnetic line concentration” in high-$\mu$ materials is used in such
practically important devices as transformers, in which two multi-turn coils are wound on a ring-shaped
(e.g., toroidal, see Fig. 6b) core made of a soft ferromagnetic material (such as the transformer steel, see
Table 1) with $\mu \gg \mu_0$. This minimizes the number of “stray” field lines, and makes the magnetic flux $\Phi$
piercing each wire turn (of either coil) virtually the same - the equality important for secondary voltage
induction - see the next chapter.
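The sphere solution (125)-(126) is easy to sanity-check numerically. The following is a minimal sketch, assuming a relative permeability mu_r = mu/mu_0 and illustrative function names (sphere_response, boundary_mismatch) that are not part of the notes: it evaluates the field ratios and verifies that the inner and outer potentials satisfy the boundary conditions (122b) and (123) at r = R.

```python
def sphere_response(mu_r):
    """Ratios from Eqs. (5.125)-(5.126) for relative permeability mu_r = mu/mu_0."""
    dipole_factor = (mu_r - 1.0) / (mu_r + 2.0)   # (mu - mu_0)/(mu + 2 mu_0)
    h_ratio = 3.0 / (mu_r + 2.0)                  # H_int / H_0
    b_ratio = 3.0 * mu_r / (mu_r + 2.0)           # B_int / B_0
    return dipole_factor, h_ratio, b_ratio


def boundary_mismatch(mu_r, radius=1.0, h0=1.0, cos_theta=0.8):
    """Mismatch of phi_m and of mu * d(phi_m)/dr (in units of mu_0) across r = R."""
    dipole_factor, h_ratio, _ = sphere_response(mu_r)
    # Potentials of Eqs. (5.125a) and (5.126), both taken at r = R.
    phi_out = h0 * (-radius + dipole_factor * radius) * cos_theta
    phi_in = -h0 * h_ratio * radius * cos_theta
    # Radial derivatives at r = R; the dipole term scales as R^3 / r^2.
    dphi_out = h0 * (-1.0 - 2.0 * dipole_factor) * cos_theta
    dphi_in = -h0 * h_ratio * cos_theta
    return phi_out - phi_in, dphi_out - mu_r * dphi_in


if __name__ == "__main__":
    for mu_r in (1.0, 10.0, 1.0e4):
        print(mu_r, sphere_response(mu_r), boundary_mismatch(mu_r))
```

For mu_r = 1 the sphere is "invisible" (no dipole, equal fields), while for large mu_r the ratio B_int/B_0 saturates just below 3, in agreement with the limit discussed above.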


Field energy in a linear magnetic

The second theoretical tool, frequently useful for problem solution, is a macroscopic expression
for magnetic field energy U. For a system with linear magnetic materials, we may repeat the
transformation of Eq. (55), made in Sec. 3, but with due respect to the magnetization, i.e. replacing j not
from Eq. (56), but from Eq. (107). As a result, instead of Eq. (57) we get

$$U = \int_V u(\mathbf{r}) \, d^3r , \quad \text{with } u = \frac{\mathbf{B} \cdot \mathbf{H}}{2} = \frac{B^2}{2\mu} = \frac{\mu H^2}{2} . \qquad (5.127)$$


This result is evidently similar to Eq. (3.79) of electrostatics. 


General magnetic energy variation
Gibbs potential energy

For the general case of nonlinear magnetics, calculations similar to those resulting in Eq. (3.82)
give the following analog of that relation:

$$\delta u = \mathbf{H} \cdot \delta \mathbf{B} , \qquad (5.128)$$

for a linear magnetic yielding Eq. (127). Similarly to the electrostatics of dielectrics, we may argue that
according to Eq. (128), in systems with magnetic media, H plays the role of the generalized force, and B
of the generalized coordinate (per unit volume). 55 As the result, the Gibbs potential energy, whose
minimum corresponds to the stable equilibrium of the system in an external field $\mathbf{H}_{ext}$, is

$$\mathscr{G} = \int_V g(\mathbf{r}) \, d^3r , \quad \text{with } g(\mathbf{r}) \equiv u(\mathbf{r}) - \mathbf{H}_{ext} \cdot \mathbf{B} , \qquad (5.129)$$


the expression to be compared with Eq. (3.84). Similarly, for a system with linear magnetics, the latter 
of these expressions may be integrated over the variations to give 


54 From the standpoint of mathematics, this happens because the solution to a boundary problem is determined
not only by the differential equation inside the system (in our case, the Laplace equation for potential $\phi_m$), but also by
boundary conditions - which are affected by magnetics - see Eqs. (118)-(119).

55 Note that in this respect, the analogy with electrostatics is incomplete. Indeed, according to Eq. (3.82), in
electrostatics the role of a generalized coordinate is played by would-be field D, and that of the generalized force,
by the real (average) electric field E. This difference may be traced back to the fact that electric field E may
perform work on a moving charged particle, while the magnetic part of the Lorentz force (10), $\mathbf{v} \times \mathbf{B}$, is always
perpendicular to the particle’s velocity, and its work equals zero. However, this difference does not affect the full
analogy of expressions (3.79) and (127) for field energy density in linear media.
$$g(\mathbf{r}) = \frac{1}{2\mu} \mathbf{B} \cdot \mathbf{B} - \mathbf{H}_{ext} \cdot \mathbf{B} = \frac{1}{2\mu} (\mathbf{B} - \mu \mathbf{H}_{ext})^2 + \text{const} , \qquad (5.130)$$

with similar consequences for the external magnetic field penetration into a system with magnetics. As a
sanity check, for a uniform system with negligible fringe fields, such as a long solenoid filled with a
uniform, linear magnetic material, Eq. (130) may be readily integrated over the sample volume to give

$$\mathscr{G} = \frac{1}{2\mu} (\mathbf{B} - \mu \mathbf{H}_{ext})^2 V + \text{const} , \qquad (5.131)$$

so that the minimum of the Gibbs potential energy, i.e. the stable equilibrium of the system, corresponds
to the result that has already been derived in the beginning of this section: $\mathbf{B} = \mu \mathbf{H}_{ext}$, i.e. $\mathbf{H} = \mathbf{H}_{ext}$.
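The position of this minimum may also be confirmed by brute force. The sketch below assumes a one-dimensional situation (B parallel to H_ext) and uses illustrative names (gibbs_density, scan_minimum) and a simple grid scan, none of which come from the notes; it merely checks that the density g = B^2/2mu - H_ext B is smallest at B = mu H_ext.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability, H/m


def gibbs_density(b, h_ext, mu):
    """Gibbs energy density of Eq. (5.130) for B parallel to H_ext."""
    return b * b / (2.0 * mu) - h_ext * b


def scan_minimum(h_ext, mu, points=20001):
    """Scan B over [0, 2*mu*H_ext] and return (B at the minimum, grid step)."""
    b_max = 2.0 * mu * h_ext
    step = b_max / (points - 1)
    best_index = min(range(points),
                     key=lambda i: gibbs_density(i * step, h_ext, mu))
    return best_index * step, step


if __name__ == "__main__":
    mu = 1000.0 * MU0          # assumed permeability of a soft ferromagnet
    b_min, step = scan_minimum(100.0, mu)
    print(b_min, mu * 100.0)   # the two values agree to within one grid step
```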

For the important particular case of a long solenoid (Fig. 6a) filled with a linear magnetic
material, we may find field H from Eq. (117), just as we used Eq. (37) in Sec. 2 for finding B for a
similar empty solenoid, getting

$$H = In , \quad \text{and hence } B = \mu In . \qquad (5.132)$$

Now we may plug this result into Eq. (127) to calculate the magnetic energy stored in the solenoid:

$$U = uV = \frac{\mu (nI)^2}{2} lA , \qquad (5.132)$$

and then use Eq. (72) to calculate its self-inductance:

$$L = \frac{U}{I^2/2} = \mu n^2 lA \qquad (5.133)$$

- an evident generalization of Eq. (75). This result explains why filling of solenoids with soft
ferromagnets with $\mu \gg \mu_0$ is so popular in the electrical engineering practice, where large self- and
mutual inductances are frequently needed in systems with size and/or weight restrictions.

Now, let us use these two tools to discuss a curious (and practically important) approach to
systems with ferromagnetic cores. First, let us find the magnetic flux $\Phi$ in a system with a relatively
thin, closed magnetic core made of sections of (possibly, different) soft ferromagnets, with the cross-section
areas $A_k$ much smaller than the squared lengths $l_k^2$ of the sections - see Fig. 15.



Fig. 5.15. Deriving the “magnetic Ohm law” (135). 


If all $\mu_k \gg \mu_0$, virtually all field lines are confined to the interior of the core. Then, applying the
macroscopic Ampere law (117) to contour C, which follows a magnetic field line inside the core (see the
dashed line in Fig. 15), we get the following approximate expression (exactly valid only in the limit
$\mu_k/\mu_0, \; l_k^2/A_k \to \infty$):

$$\oint_C H_l \, dl \approx \sum_k l_k H_k \equiv \sum_k l_k \frac{B_k}{\mu_k} = NI . \qquad (5.134)$$

However, since the magnetic field lines stay in the core, the magnetic flux $\Phi_k \approx B_k A_k$ should be the same
($\equiv \Phi$) for each section, so that $B_k = \Phi/A_k$. Plugging this condition into Eq. (134), we get

Magnetic Ohm law and reluctance

$$\Phi = \frac{NI}{\sum_k \mathscr{R}_k} , \quad \text{with } \mathscr{R}_k \equiv \frac{l_k}{\mu_k A_k} . \qquad (5.135)$$

Note a close analogy of the first of these equations with the Ohm law for several resistors
connected in series, with the magnetic flux playing the role of electric current, and the product NI, that of
the voltage applied to the resistor chain. This analogy is fortified by the fact that the second of Eqs.
(135) is similar to the expression for resistance $R = l/\sigma A$ of a long uniform conductor, with the magnetic
permeability $\mu$ playing the role of the electric conductivity $\sigma$. (In order to sound similar, but still
different from resistance R, parameter $\mathscr{R}$ is called the reluctance.) This is why Eq. (135) is called the
magnetic Ohm law; it is very useful for approximate analyses of systems like ac transformers, magnetic
energy storage systems, etc.
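As an illustration of how the magnetic Ohm law is used in practice, here is a minimal sketch for a core consisting of sections connected in series. The function names and the numbers (an iron section with mu = 2000 mu_0 and a 1-mm air gap) are assumptions made only for this example.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability, H/m


def reluctance(length, mu, area):
    """Reluctance l / (mu * A) of one core section, as in Eq. (5.135)."""
    return length / (mu * area)


def core_flux(ampere_turns, sections):
    """Flux NI / sum of reluctances; sections is a list of (l_k, mu_k, A_k)."""
    return ampere_turns / sum(reluctance(l, mu, a) for (l, mu, a) in sections)


def ampere_law_sum(flux, sections):
    """Left-hand side of Eq. (5.134): sum of l_k * B_k / mu_k with B_k = flux / A_k."""
    return sum(l * (flux / a) / mu for (l, mu, a) in sections)


if __name__ == "__main__":
    # Assumed example: 30 cm of iron plus a 1 mm air gap, both 1 cm^2 in cross-section.
    sections = [(0.3, 2000.0 * MU0, 1.0e-4), (1.0e-3, MU0, 1.0e-4)]
    flux = core_flux(50.0, sections)            # NI = 50 ampere-turns
    print(flux, ampere_law_sum(flux, sections))
```

Note that in this example even the very thin air gap has a larger reluctance than the whole iron section, just as a small series resistor with low conductivity may dominate an electric circuit.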

The role of the “magnetic e.m.f.” NI may be also played by a permanent-magnet section of the
core. Indeed, for relatively low fields we may use the Taylor expansion of the nonlinear function B(H)
near H = 0 to write

$$B \approx \pm \mu_0 M_S + \mu_d H , \quad \mu_d \equiv \frac{dB}{dH} \Big|_{H=0} , \qquad (5.136)$$

where $M_S$ is the spontaneous magnetization magnitude at H = 0, the ± sign corresponds to two possible
directions of the magnetization, and parameter $\mu_d$ is called the differential (or “dynamic”) permeability.
Expressing H from this relation, and using it in one of components of the sum (134), we again get a
result similar to Eq. (135)


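$$\Phi = \frac{(NI)_{ef}}{\mathscr{R}_H + \sum_k \mathscr{R}_k} , \quad \text{with } \mathscr{R}_H \equiv \frac{l_H}{\mu_d A_H} , \qquad (5.137)$$

where $l_H$ and $A_H$ are geometric dimensions of the hard-ferromagnet section, and product NI is replaced
with its effective value

$$(NI)_{ef} = \pm \frac{\mu_0}{\mu_d} M_S l_H . \qquad (5.138)$$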

This result may be used for a semi-quantitative explanation of the well-known short-range forces 
acting between permanent magnets (or between them and soft ferromagnets) at their mechanical contact 
(Fig. 16). 
Fig. 5.16. Short-range interaction between magnets.


Indeed, considering the free-space gaps between them as sections of the core (which is
approximately correct, because due to the small gap thickness d the magnetic field lines cannot stray far
from the contact area), and neglecting the reluctance $\mathscr{R}$ of the bulk material (due to its larger
cross-section), we get

$$\Phi \propto \left( \frac{2d}{\mu_0} + \frac{l}{\mu_d} \right)^{-1} , \qquad (5.139)$$

so that, according to Eq. (127), the magnetic energy of the system (disregarding the constant energy of
the permanent magnetization) is

$$U \propto \left( \frac{2d}{\mu_0} + \frac{l}{\mu_d} \right) B^2 \propto \left( \frac{2d}{\mu_0} + \frac{l}{\mu_d} \right)^{-1} \propto \frac{1}{d + d_0} , \quad d_0 \equiv \frac{\mu_0}{2\mu_d} l . \qquad (5.140)$$

Hence the magnet attraction force,

$$F = \left| \frac{\partial U}{\partial d} \right| \propto \frac{1}{(d + d_0)^2} , \qquad (5.141)$$

behaves almost as the divergence $1/d^2$ truncated at a short distance $d_0 \ll l$. Due to that truncation, the
force is finite at d = 0; this is exactly the force you need to apply to detach two magnets.
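To get a feeling of how short-range this force is, the following sketch evaluates the truncation distance d_0 and the force relative to its contact value, F(d)/F(0) = [d_0/(d + d_0)]^2, which follows from Eq. (141). The function names and the numbers (a 1-cm magnet with mu_d = 5 mu_0) are assumptions chosen only for illustration.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability, H/m


def truncation_distance(length, mu_d):
    """d_0 = mu_0 * l / (2 * mu_d), see Eq. (5.140)."""
    return MU0 * length / (2.0 * mu_d)


def relative_force(gap, length, mu_d):
    """F(d) / F(0) = (d_0 / (d + d_0))**2, from Eq. (5.141)."""
    d0 = truncation_distance(length, mu_d)
    return (d0 / (gap + d0)) ** 2


if __name__ == "__main__":
    length, mu_d = 0.01, 5.0 * MU0   # assumed magnet length and dynamic permeability
    for gap in (0.0, 1.0e-3, 1.0e-2):
        print(gap, relative_force(gap, length, mu_d))
```

With these assumed parameters, d_0 is about 1 mm, so that a gap of just 1 mm already reduces the force by a factor of 4.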


Finally, let us discuss in brief a related effect in experiments with thin and long hard
ferromagnetic samples - “needles”, like those used in magnetic compasses. Using the definition (108) of
field H, the Maxwell equation (29) takes the form

$$\nabla \cdot \mathbf{B} = \mu_0 \nabla \cdot (\mathbf{H} + \mathbf{M}) = 0 , \qquad (5.142)$$

and may be rewritten as

$$\nabla \cdot \mathbf{H} = -\nabla \cdot \mathbf{M} . \qquad (5.143)$$

While this relation is general, it is especially convenient in hard ferromagnets, where M is virtually
fixed by the saturation. Comparing this equation with Eq. (1.27) for the electrostatic field, we see that
the right-hand part of Eq. (143) may be considered as a fixed source of a Coulomb-like magnetic field.

For example, let us apply Eq. (143) to a thin, long needle made of a hard ferromagnet (Fig. 17a).
Inside the needle, $M = M_S$ = const, while outside it M = 0, so that the right-hand part of Eq. (143) is
substantially different from zero only in two small areas at the needle’s ends, and on much larger
distances we can use the following approximation:

$$\nabla \cdot \mathbf{H} = -q_m \delta(\mathbf{r} - \mathbf{r}_1) + q_m \delta(\mathbf{r} - \mathbf{r}_2) , \qquad (5.155)$$
where $\mathbf{r}_{1,2}$ are ends’ positions, and $q_m \equiv M_S A$, with A being the needle’s cross-section area. This equation
is completely similar to Eq. (1.27) for the electric field created by two equal and opposite point charges.
In particular, if two ends of two needles are held at an intermediate distance r ($A^{1/2} \ll r \ll l$, where l is
the needle length, see Fig. 17b), the ends interact in accordance with the magnetic Coulomb law

$$F \propto \frac{q_m^2}{r^2} . \qquad (5.156)$$




Fig. 5.17. (a) “Magnetic charges” at the ends of a thin ferromagnetic needle and (b) the result of its breaking 
into two parts (schematically). 


The “only” (but conceptually, very significant!) difference with electrostatics is that the
“magnetic charges” $\pm q_m$ cannot be fully separated. For example, if we break a magnetic needle in the
middle in an attempt to bring its two ends further apart, two new “charges” appear - see Fig. 17b. There
are several solid state systems where more flexible structures, similar to the magnetic needles, may be
implemented. First of all, certain (“type-II”) superconductors may sustain so-called Abrikosov vortices -
crudely, flexible tubes with field-suppressed superconductivity inside, each carrying one magnetic flux
quantum $\Phi_0 = \pi\hbar/e \approx 2 \times 10^{-15}$ Wb - see Sec. 6.3. Ending on the superconductor’s surface, these tubes let the
magnetic field lines spread into the surrounding space, essentially forming a magnetic monopole
analog (of course, with an equal and opposite “monopole” on the other end of the line). Such flux tubes
are not only flexible but readily stretchable, resulting in several peculiar effects. 56 Other recently
found examples of paired “monopoles” include spin chains in so-called spin ices - crystals with
paramagnetic ions arranged into a specific (pyrochlore) lattice - such as dysprosium titanate Dy2Ti2O7. 57


5.7. Exercise problems 

5.1. Two straight, parallel, long, plane, thin strips of width w,
separated by distance d, are used to form a current loop - see Fig. on the
right. Calculate the magnetic field in the plane located at the middle
between the planes of the strips, assuming that current I is uniformly
distributed across strip width.


56 A detailed discussion of the Abrikosov vortices may be found, for example, in Chapter 5 of M. Tinkham,
Introduction to Superconductivity, 2nd ed., McGraw-Hill, 1996.

57 See, e.g., L. Jaubert and P. Holdsworth, J. Phys. - Cond. Matt. 23, 164222 (2011) and references therein.
5.2. For the system studied in the previous problem, but now only in the limit $d \ll w$, calculate:

(i) the distribution of the magnetic field (in the simplest possible way),

(ii) the vector-potential of the field,

(iii) the force (per unit length) acting on each strip, and

(iv) the magnetic energy and self-inductance of the system (per unit length).

5.3. Calculate the magnetic field distribution near the center of the system of
two similar, plane, round, coaxial wire coils, fed by equal but oppositely directed
currents - see Fig. on the right.

5.4. The two-coil system, similar to that considered in the previous problem,
carries equal and similarly directed currents - see Fig. on the right. Calculate what
should be the ratio d/R for the second derivative $\partial^2 B_z/\partial z^2$ at z = 0 to vanish. 58

5.5. Calculate the magnetic field distribution along the axis of a
straight solenoid (see Fig. 6a, partly reproduced on the right) with a finite
length l, and round cross-section of radius R. Assume that the solenoid has
many wire turns ($N \gg 1$) that are uniformly distributed along its length.

5.6. A thin spherical shell of radius R, with charge Q uniformly distributed over its surface,
rotates about its axis with angular velocity $\omega$. Calculate the distribution of the magnetic field
everywhere in space.

5.7. A sphere of radius R, made of an insulating material with a uniform electric charge density
$\rho$, rotates about its diameter with angular velocity $\omega$. Calculate the magnetic field distribution inside the
sphere and outside it.

5.8. The reader is (hopefully :-) familiar with the classical Hall effect when it takes place in the
usual rectangular Hall bar geometry - see the left panel of the Fig. below. However, the effect takes a
different form in the so-called Corbino disk - see the right panel below. (Dark shading shows electrodes,
with no appreciable resistance.) Analyze the effect in both geometries, assuming that in both cases the
conductors are thin, planar, have a constant Ohmic conductivity $\sigma$ and charge carrier density n, and that
the applied magnetic field B is uniform and normal to conductors’ planes.


58 Such a system, producing a highly uniform field near its center, is called the Helmholtz coils, and is broadly used
in physics experiments.







5.9.* The simplest model of the famous homopolar motor 59 is a
thin, round conducting disk, placed into a uniform magnetic field normal
to its plane, and fed by dc current flowing from disk’s center to a sliding
electrode (“brush”) - see Fig. on the right.

(i) Express the torque, rotating the disk, via its radius $\mathscr{R}$, magnetic
field B, and current I.

(ii) If the disk is allowed to rotate about its axis, and the motor is driven by a battery with e.m.f.
V, calculate its angular velocity $\omega$, neglecting electric circuit’s resistance and friction.

(iii) Now assuming that the current circuit (battery + wires + contacts + disk itself) has full
resistance R, derive and solve the equation for the time evolution of $\omega$, and analyze the solution.

5.10.* Estimate the values of magnetic susceptibility due to

(i) orbital diamagnetism, and

(ii) spin paramagnetism,

for a dilute medium with negligible interaction between molecular dipoles.

Hints: For task (i), you may use the classical model described by Eq. (114) (see Fig. 13), while
for task (ii), assume the mechanism of ordering of spontaneous magnetic dipoles $\mathbf{m}_0$, similar to the one
sketched for electric dipoles in Fig. 12b, with the magnitude of the order of the Bohr magneton $\mu_B$ - see
Eq. (96).

5.11.* Use the classical picture of the orbital (“Larmor”) diamagnetism, discussed in Sec. 5.5 of
the lecture notes, to calculate its (small) correction $\Delta B(0)$ to the magnetic field B, as felt by the atomic
nucleus, modeling atomic electrons by a spherically-symmetric cloud with electric charge density $\rho(r)$.
Express the result via the value $\phi(0)$ of the electrostatic potential of electrons’ cloud, and use this
expression for a crude numerical estimate of the relative correction, $\Delta B(0)/B$, for the hydrogen atom.

5.12. Current I flows in a thin wire bent into a plane, round loop of radius R. Calculate the net
magnetic flux through the whole plane in which the loop is located.



59 It was invented by M. Faraday in 1821, i.e. well before his celebrated work on electromagnetic induction. The
adjective “homopolar” refers to the constant “polarity” (sign) of the current; the alternative term is “unipolar”.
5.13. Calculate the (self-) inductance of a toroidal
solenoid (Fig. 6b) with the round cross-section of radius r ~ R
(see Fig. on the right), filled with a material of magnetic
permeability $\mu$, with many ($N \gg 1, R/r$) wire turns uniformly
distributed along the perimeter. Check your results by analyzing
the limit $r \ll R$.






Hint: You may like to use the following table integral: 60

$$\int_0^\pi \frac{d\xi}{a + \cos\xi} = \frac{\pi}{(a^2 - 1)^{1/2}} , \quad \text{for } a > 1 .$$


5.14. Prove that:

(i) the self-inductance L of a current loop cannot be negative, and

(ii) each mutual inductance coefficient $L_{kk'}$, defined by Eq. (60), cannot be larger than
$(L_{kk} L_{k'k'})^{1/2}$.


5.15. A round cylindrical shell, made of a soft ferromagnet, is
placed into a uniform external field $\mathbf{H}_0$ perpendicular to its axis - see
Fig. on the right. Find the distribution of the magnetic field everywhere
in the system, and discuss its efficiency as a “magnetic shield”.

5.16. A straight thin wire, carrying current I, passes parallel to
the plane boundary between two uniform, linear magnetics - see Fig. on
the right. Calculate the magnetic field everywhere in the system, and
the force (per unit length) exerted on the wire.

5.17. Calculate the distribution of magnetic field around a sphere made of a hard ferromagnet
with a permanent, uniform magnetization M = const.

5.18.* A limited volume V is filled with a magnetic material with magnetization M(r).

(i) Use Eq. (5.143) to write explicit expressions for the magnetic field and its potential, induced
by the magnetization.

(ii) Recast these expressions in forms convenient when $\mathbf{M}(\mathbf{r}) = \mathbf{M}_0$ = const inside volume V.


5.19. Use the results of the previous problem to calculate the
distribution of the magnetic field along the axis of a straight
permanent magnet of length 2l, with round cross-section of radius R, and uniform magnetization $M_0$
parallel to the axis - see Fig. on the right.

60 See, e.g., MA (6.13).

5.20. A very broad film of thickness 2t is magnetized normally to its plane, with a periodic
checkerboard pattern with square side a:

$$\mathbf{M}\big|_{|z|<t} = \mathbf{n}_z M(x,y) , \quad \text{with } M(x,y) = M_0 \times \begin{cases} (+1), & \text{if } \cos\frac{\pi x}{a} \cos\frac{\pi y}{a} > 0 , \\ (-1), & \text{if } \cos\frac{\pi x}{a} \cos\frac{\pi y}{a} < 0 . \end{cases}$$

Calculate the magnetic field distribution in space. 61

5.21. A flat end of a rod magnet, with cross-section area
A, with saturated magnetization $M_S$ directed along rod’s length,
is allowed to stick to a plane surface of a large sample made of a soft
ferromagnetic material with $\mu \gg \mu_0$. Calculate the force
necessary to detach the rod from the surface, if it is applied
strictly perpendicular to the contact surface - see Fig. on the
right.

5.22.* Based on the discussion of the quadrupole electrostatic lens in Sec. 2.4 of the lecture
notes, suggest permanent-magnet systems which may similarly focus particles moving close to system’s
axis, and carrying:

(i) an electric charge,

(ii) no net electric charge, but a nonvanishing spontaneous magnetic dipole moment m.

5.23. A circular wire loop, carrying a fixed dc current, has been
placed inside a similar but larger loop, carrying a fixed current in the same
direction - see Fig. on the right. Use semi-quantitative arguments to analyze
the mechanical stability of the coaxial, coplanar position of the inner loop
with respect to its possible angular, axial, and lateral displacements, if the
position of the outer loop is fixed.


61 This problem is of an evident relevance for the perpendicular magnetic recording (PMR) technology, which
presently dominates the high-density digital magnetic recording, with the density already approaching 1 Tb/in$^2$.
Chapter 6. Time-Dependent Electromagnetism

This chapter discusses two major new effects that appear if the electric and magnetic fields are
changing in time: the “electromagnetic induction” of electric field by changing magnetic field, and the
reciprocal effect of “displacement currents” - the induction of magnetic field by changing electric field.
These two phenomena, which make the time-dependent electric and magnetic fields inseparable,
contribute to the system of four Maxwell equations, and make it valid for arbitrary electromagnetic
processes. On the way, I will pause for a brief review of the electrodynamics of superconductivity, which
(besides its own significance) provides a perfect platform for a discussion of the gauge invariance.


6.1. Electromagnetic induction

As Eqs. (5.36) and (5.109) show, in static situations ($\partial/\partial t = 0$) the Maxwell equations describing
the electric and magnetic fields are independent, and are coupled only implicitly, via the continuity
equation (4.5) relating their right-hand parts $\rho$ and j. (In statics this relation imposes a restriction only on
vector j.) In dynamics, when the fields change in time, the situation is different.

Historically, the first discovered explicit coupling between the electric and magnetic fields was
the effect of electromagnetic induction. 1 The summary of Faraday’s numerous experiments has turned
out to be very simple: if the magnetic flux, defined by Eq. (5.65),

$$\Phi \equiv \int_S B_n \, d^2r , \qquad (6.1)$$

through a surface S limited by contour C, changes in time by whatever reason (e.g., either due to a
change of the magnetic field B, or contour’s motion, or its deformation), it induces an additional, vortex-like
electric field $\mathbf{E}_{ind}$, similar in its topology to the magnetic field induced by a current. The exact
distribution of $\mathbf{E}_{ind}$ in space depends on system geometry details and may be rather complex, but its
integral along the contour C, called the inductive electromotive force (e.m.f.), obeys a very simple
Faraday induction law: 2

Faraday induction law

$$V_{ind} \equiv \oint_C \mathbf{E}_{ind} \cdot d\mathbf{r} = -\frac{d\Phi}{dt} . \qquad (6.2)$$


It is straightforward (and hence left for the reader’s exercise :-) to show that the e.m.f. may be
measured, for example, either by inserting a voltmeter into a conducting loop following contour C, or by
measuring current $I = V_{ind}/R$ it induces in a thin wire with Ohmic resistance R, whose shape follows that
contour. The minus sign in Eq. (2) corresponds to the so-called Lenz rule: the magnetic field of the
induced Ohmic current provides a partial compensation of the change of the original $\Phi$ in time.
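A quick numerical illustration of the induction law (2) may be useful here. The sketch below assumes the simplest case of contour’s motion: a flat loop of area A rotating with angular velocity omega in a uniform, constant field B, so that the flux is B A cos(omega t). This geometry, the function names, and the numbers are my assumptions for the example only; the e.m.f. obtained by numerical differentiation of the flux is compared with the analytical result B A omega sin(omega t).

```python
import math


def flux(t, b_field, area, omega):
    """Magnetic flux through a flat loop rotating in a uniform field B."""
    return b_field * area * math.cos(omega * t)


def emf_numeric(t, b_field, area, omega, dt=1.0e-7):
    """Inductive e.m.f. -dPhi/dt of Eq. (6.2), via a central finite difference."""
    d_flux = flux(t + dt, b_field, area, omega) - flux(t - dt, b_field, area, omega)
    return -d_flux / (2.0 * dt)


def emf_analytic(t, b_field, area, omega):
    """Exact -dPhi/dt for the same rotating loop."""
    return b_field * area * omega * math.sin(omega * t)


if __name__ == "__main__":
    b_field, area, omega = 0.1, 0.01, 2.0 * math.pi * 50.0  # T, m^2, rad/s
    t = 1.0e-3
    print(emf_numeric(t, b_field, area, omega), emf_analytic(t, b_field, area, omega))
```

At small positive t the flux decreases, so that the e.m.f. is positive, i.e. it drives a current whose field tends to sustain the original flux, in accordance with the Lenz rule.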


In order to recast Eq. (2) in a differential form, let us apply, to the above definition of $V_{ind}$, the
same Stokes theorem that was repeatedly used in Chapter 5. 3 The result is

$$V_{ind} = \int_S (\nabla \times \mathbf{E}_{ind})_n \, d^2r . \qquad (6.3)$$

Now combining Eqs. (1)-(3), for a contour C whose shape does not change in time (so that the
integration along it is interchangeable with the time derivative), 4 we get

$$\int_S \left( \nabla \times \mathbf{E}_{ind} + \frac{\partial \mathbf{B}}{\partial t} \right)_n d^2r = 0 . \qquad (6.4)$$


1 The induction e.m.f. was discovered independently by J. Henry and M. Faraday, but it was the brilliant
experiment series of the latter physicist, carried out in 1831, which resulted in this general formulation of the law.

2 In Gaussian units, the right-hand part of this formula has the additional coefficient 1/c.

3 If necessary, see MA Eq. (12.1) again.




Since the induced electric field is additional to the field (1.33) created by electric charges, for the
net field we should write $\mathbf{E} = \mathbf{E}_{ind} - \nabla\phi$. However, since the curl of any gradient field is zero, 5 $\nabla \times (\nabla\phi) = 0$,
Eq. (4) is valid for the net field E. Since this equation should be correct for any closed area S, we may
conclude that

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 \qquad (6.5)$$

at any point. This is the final (time-dependent) form of this Maxwell equation. Superficially, it may look
that Eq. (5) is less general than Eq. (2); for example, that it does not describe any electric field, and
hence any e.m.f., in a moving loop if field B is constant in time, even though flux (1) does change in time.
However, this is not true; in Chapter 9 we will see that in the reference frame moving with the loop such
e.m.f. does appear.


Now let us re-formulate Eq. (5) in terms of the vector-potential. Since the induction effect does
not alter the fundamental relation $\nabla \cdot \mathbf{B} = 0$, we still may present the magnetic field as prescribed by Eq.
(5.27), i.e. as $\mathbf{B} = \nabla \times \mathbf{A}$. Plugging this expression into Eq. (5), we get

$$\nabla \times \left( \mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} \right) = 0 . \qquad (6.6)$$

Hence we can use the argumentation of Sec. 1.3 (there applied to vector E alone) to present the
expression in parentheses as $-\nabla\phi$, so that

$$\mathbf{E} = -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi . \qquad (6.7)$$

It is tempting to interpret the first term of the right-hand part as describing the electromagnetic
induction alone, and the second term as representing a purely electric field induced by electric charges.
However, the separation of these two terms is, to a certain extent, conditional. Indeed, let us consider the
gauge transformation already mentioned in Sec. 5.2,

$$\mathbf{A} \to \mathbf{A} + \nabla\chi , \qquad (6.8)$$


4 Let me admit that from the beginning of the course, I was carefully sweeping under the rug a very important
question: in what exactly reference frame(s) are all the equations of electrodynamics valid? I promise to discuss
this issue in detail later in the course (in Chapter 9), and for now would like to get away with a very short answer:
all the formulas discussed so far are valid in any inertial reference frame, as defined in classical kinematics - see,
e.g., CM Chapter 1. It is crucial, however, to have fields E and B measured in the same reference frame.

5 See, e.g., MA Eq. (11.1). 
that, as we already know, does not change the magnetic field. According to Eq. (7), in order to keep the
full electric field intact (gauge-invariant) as well, the scalar electric potential has to be transformed
simultaneously, as

$$\phi \to \phi - \frac{\partial \chi}{\partial t} , \qquad (6.9)$$

leaving the choice of a time-independent addition to $\phi$ restricted only by the Laplace equation - since
the full $\phi$ should satisfy the Poisson equation (1.41) with a gauge-invariant right-hand part. We will
return to the discussion of gauge invariance in Sec. 3.
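For the reader’s convenience, here is the explicit check (a short worked step, with primes denoting the transformed quantities and $\chi$ the arbitrary gauge function) that the simultaneous transformation of both potentials indeed leaves both fields intact:

```latex
\begin{align*}
\mathbf{B}' &= \nabla \times (\mathbf{A} + \nabla\chi)
             = \nabla \times \mathbf{A} + \nabla \times (\nabla\chi) = \mathbf{B}, \\
\mathbf{E}' &= -\frac{\partial}{\partial t}\left(\mathbf{A} + \nabla\chi\right)
               - \nabla\left(\phi - \frac{\partial\chi}{\partial t}\right)
             = -\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi
               - \nabla\frac{\partial\chi}{\partial t} + \nabla\frac{\partial\chi}{\partial t}
             = \mathbf{E}.
\end{align*}
```

The first line uses the fact that the curl of any gradient vanishes, and the second one, the interchangeability of the spatial and temporal differentiation of $\chi$.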

Now let us discuss whether Eqs. (2) or (5) describing the electromagnetic induction represent
some completely new facts, on top of all the equations of electrostatics and magnetostatics, discussed in
the previous five chapters. The answer is no. To demonstrate that, let us consider a thin wire loop with
current I, placed in a magnetic field (Fig. 1). According to Eq. (5.21), the magnetic force exerted by the
field upon a small fragment of the wire is

$$d\mathbf{F} = I (d\mathbf{r} \times \mathbf{B}) = -I (\mathbf{B} \times d\mathbf{r}) , \qquad (6.10)$$

where $d\mathbf{r}$ is a small vector, tangential to the loop’s contour and directed along current I. Now let the wire be
slightly (and slowly) deformed so that this particular fragment is displaced by a small distance $\delta\mathbf{r}$. (Let
me hope that Fig. 1 makes the difference between the elementary vectors $d\mathbf{r}$ and $\delta\mathbf{r}$ absolutely clear.)



Since the wire’s acceleration (if any) is negligibly small, external (non-magnetic) forces should
balance force (10), i.e. provide an equal and opposite force. This is why the work of these external
forces at the displacement $\delta\mathbf{r}$, i.e. the change of the magnetic field energy U, is

$$\delta(dU) = -d\mathbf{F} \cdot \delta\mathbf{r} = I \, \delta\mathbf{r} \cdot (\mathbf{B} \times d\mathbf{r}) . \qquad (6.11)$$

Let us apply to this mixed product the general operand rotation rule of the vector algebra, 6 so that vector
B comes out of the vector product:

$$\delta(dU) = I \, \mathbf{B} \cdot (d\mathbf{r} \times \delta\mathbf{r}) . \qquad (6.12)$$

But the magnitude of this vector product is nothing more than the area $\delta(d^2r) \equiv \delta(dS)$ swept by the
wire’s fragment at the deformation (Fig. 1), while its direction is perpendicular to this elementary area
dS, along the “proper” normal vector $\mathbf{n} = (d\mathbf{r}/dr) \times (\delta\mathbf{r}/\delta r)$. The scalar multiplication of B by this vector is


6 See, e.g., MA Eq. (7.6). 
equivalent to taking its normal component. Hence, integrating Eq. (12) over all the wire length, we get
the following result for the total variation of the magnetic energy:

$$\delta U = I \oint_C B_n \, \delta(d^2r) . \qquad (6.13)$$

If B does not change at the wire deformation, the variation sign may be moved out from the integral, and
Eq. (13) yields 7

$$\delta U = I \, \delta\Phi , \qquad (6.14)$$

where $\Phi$ is the magnetic flux through the loop.

Now let the work $\delta\mathscr{W} = \delta U$, necessary for this energy change, come from a generator of
voltage $V_{ext}$, inserted somewhere in the loop. In order for the system to be in quasi-equilibrium, this
voltage should counter-balance the electromagnetic induction’s e.m.f. $V_{ind}$. Work of the voltage at
transfer of charge $\delta Q = I \delta t$, during elementary deformation’s duration $\delta t$, is

$$\delta\mathscr{W} = V_{ext} \, \delta Q = -V_{ind} \, \delta Q \equiv -V_{ind} I \, \delta t . \qquad (6.15)$$

Comparing Eqs. (14) and (15), we arrive at the Faraday induction law (2).
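Spelled out, this last comparison takes just two lines (here $\delta\mathscr{W}$ denotes the elementary work of the external voltage source):

```latex
\begin{align*}
I \, \delta\Phi &= \delta U = \delta\mathscr{W} = -V_{ind} \, I \, \delta t , \\
\Rightarrow \quad V_{ind} &= -\frac{\delta\Phi}{\delta t} \;\to\; -\frac{d\Phi}{dt}
\quad \text{at } \delta t \to 0 .
\end{align*}
```

Note that current I cancels out, so that the result is valid for any current magnitude.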

Moreover, some authors derive Eq. (2) in this way, implying that there is no new information in
the induction law at all. Note, however, that the simple derivation given above has used the assumption
of the magnetic field’s independence of the deformation. A removal of this limitation would require using
the Lorentz field transform (which will be only discussed in Chapter 9), and a very careful
argumentation to exclude a faulty logic loop, because the transform itself is typically derived from
Maxwell equations - including Eq. (5) that we are trying to prove. Personally I am happy that Dr.
Faraday did his thorough work so early, placing the electromagnetic induction law on a firm
experimental basis.


6.2. Quasistatic approximation and skin effect 

As we will see later in this chapter, the interplay of the electromagnetic induction with one more
time-dependent effect (the so-called displacement currents) enables electromagnetic waves propagating
with speed $c = 1/(\varepsilon_0\mu_0)^{1/2}$ in free space, and with a comparable speed $v = 1/(\varepsilon\mu)^{1/2}$ in dielectric and/or
magnetic materials. For the phenomena whose spatial scale is much smaller than the wavelength
$\lambda = 2\pi v/\omega$, the displacement current effects are negligible, and time-dependent phenomena may be described
by using Eq. (6) together with three other macroscopic Maxwell equations in their unmodified form: 8


$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0, \qquad \nabla\times\mathbf{H} = \mathbf{j}, \qquad \nabla\cdot\mathbf{D} = \rho, \qquad \nabla\cdot\mathbf{B} = 0. \tag{6.16}$$

These equations define the so-called quasistatic approximation of electromagnetism, and are
sufficient to describe many important phenomena. Let us use them first of all for an analysis of the
so-called skin effect, the phenomenon of self-shielding of the alternating (ac) magnetic fields by currents
flowing in a conductor.

7 Actually, Eq. (14) is just an integral version of Eq. (5.128).

8 Actually, the absence of time-dependent corrections to other Maxwell equations in the quasistatic approximation
should be considered as an additional experimental fact.

In order to form a complete system of equations, Eqs. (16) should be augmented by material
equations describing the medium. Let us take them, for a conductor, in the simplest (and simultaneously,
most common) linear and isotropic form:

$$\mathbf{j} = \sigma\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{6.17}$$


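If the conductor is uniform, i.e. coefficients $\sigma$ and $\mu$ are constant inside it, the whole system of
equations (16)-(17) may be reduced to a single equation. Indeed, a sequential substitution of these
equations into each other yields:

$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} = -\frac{1}{\sigma}\nabla\times\mathbf{j} = -\frac{1}{\sigma}\nabla\times(\nabla\times\mathbf{H}) = -\frac{1}{\sigma\mu}\nabla\times(\nabla\times\mathbf{B}) = -\frac{1}{\sigma\mu}\left[\nabla(\nabla\cdot\mathbf{B}) - \nabla^2\mathbf{B}\right] = \frac{1}{\sigma\mu}\nabla^2\mathbf{B}. \tag{6.18}$$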


Thus we have arrived, without any further assumptions, at a very simple partial differential
equation. Let us use it to analyze the skin effect in the simplest geometry (Fig. 2a) when an external
source (which, at this point, does not need to be specified) produces, near a plane surface of a bulk
conductor, a spatially-uniform ac magnetic field $\mathbf{H}^{(0)}(t)$ parallel to the surface.



Fig. 6.2. (a) Skin effect in the simplest, planar geometry, and (b) two Ampere contours for deriving
the “microscopic” (contour $C_1$) and the “macroscopic” (contour $C_2$) boundary conditions for $\mathbf{H}$.


Selecting the coordinate system as shown in Fig. 2, we may express this condition as

$$\mathbf{H}\big|_{x=-0} = H^{(0)}(t)\,\mathbf{n}_y. \tag{6.19}$$

The translational symmetry of our simple problem within the surface plane [y, z] implies that inside the
conductor $\partial/\partial y = \partial/\partial z = 0$ as well, and $\mathbf{H} = H(x,t)\,\mathbf{n}_y$ even at $x > 0$, so that Eq. (18) for conductor's
interior is reduced to a differential equation for just one scalar function $H(x,t) = B(x,t)/\mu$: 9

$$\frac{\partial H}{\partial t} = \frac{1}{\sigma\mu}\frac{\partial^2 H}{\partial x^2}, \qquad \text{for } x > 0. \tag{6.20}$$


9 Due to the simple linear relation between fields $\mathbf{B}$ and $\mathbf{H}$, it does not matter too much which of them is used for
the solution of this problem. A slight preference is for $\mathbf{H}$, due to the simplicity of the boundary condition (5.118).

This equation may be further simplified by noticing that due to its linearity, we may use the linear
superposition principle for the time dependence of the field, 10 via expanding it, as well as the external
field (19), into the Fourier series,

$$H(x,t) = \sum_\omega H_\omega(x)\,e^{-i\omega t}, \quad \text{for } x > 0, \qquad H^{(0)}(t) = \sum_\omega H^{(0)}_\omega e^{-i\omega t}, \quad \text{for } x = 0, \tag{6.21}$$

and arguing that if we know the solution for each frequency component, the whole field may be found
through the elementary summation (21) of these solutions. For each single-frequency component, Eq.
(20) is immediately reduced to an ordinary differential equation for the complex amplitude $H_\omega(x)$:

$$-i\omega H_\omega = \frac{1}{\sigma\mu}\frac{d^2 H_\omega}{dx^2}. \tag{6.22}$$


From the theory of linear differential equations we know that Eq. (22) has the following general
solution:

$$H_\omega(x) = H_+ e^{\kappa_+ x} + H_- e^{\kappa_- x}, \tag{6.23}$$

where constants $\kappa_\pm$ are roots of the characteristic equation that may be obtained by substitution of any of
these two exponents into the initial differential equation. For our particular case, the characteristic
equation, following from Eq. (22), is

$$-i\omega = \frac{\kappa^2}{\sigma\mu}, \tag{6.24}$$

and its roots are complex constants

$$\kappa_\pm = (-i\mu\omega\sigma)^{1/2} = \pm\frac{1-i}{\sqrt{2}}\,(\mu\omega\sigma)^{1/2}. \tag{6.25}$$
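The roots (25) are easy to verify numerically. The following minimal sketch solves the characteristic equation $-i\omega = \kappa^2/\sigma\mu$ with Python's standard `cmath` module and compares the result with the closed form $\pm(1-i)(\mu\omega\sigma)^{1/2}/\sqrt{2}$. The function name and the sample parameters (a copper-like conductivity of $5.96\times10^7$ S/m at 60 Hz) are assumptions made only for this illustration.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def characteristic_roots(sigma, omega, mu=MU0):
    """Roots kappa_+ and kappa_- of -i*omega = kappa**2 / (sigma*mu)."""
    kappa_plus = cmath.sqrt(-1j * mu * sigma * omega)  # principal root, Re > 0
    return kappa_plus, -kappa_plus


# Assumed illustration values: a copper-like conductor at 60 Hz.
sigma = 5.96e7                 # S/m
omega = 2 * math.pi * 60.0     # rad/s
kappa_plus, kappa_minus = characteristic_roots(sigma, omega)

# Closed form: (1 - i)/sqrt(2) * (mu*omega*sigma)**(1/2)
closed_form = (1 - 1j) / math.sqrt(2) * math.sqrt(MU0 * omega * sigma)

print("kappa_+     =", kappa_plus)
print("closed form =", closed_form)
print("decay length -1/Re(kappa_-) =", -1 / kappa_minus.real, "m")
```

Only the root with the negative real part describes a field decaying into the conductor, which is why just one of the two exponents survives in what follows.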


For our problem, the field cannot grow exponentially at $x \to +\infty$, so that only one of the
coefficients, namely $H_-$ corresponding to the decaying exponent, with Re $\kappa < 0$ (i.e. $\kappa = \kappa_-$), may be
nonvanishing, so that $H_\omega(x) = H_\omega(0)\exp\{\kappa_- x\}$. In order to find the constant factor $H_\omega(0)$, we can integrate
the Maxwell equation $\nabla\times\mathbf{H} = \mathbf{j}$ along a pre-surface contour - say, contour $C_1$ shown in Fig. 2b. The
right-hand part's integral is negligible, because $\mathbf{j}$ does not contain any “genuinely surface” currents,
localized at a depth much smaller than $1/\mathrm{Re}[-\kappa_-]$. As a result, we get the “microscopic” 11 boundary
condition similar to Eq. (5.118) for the stationary magnetic field, $H_y$ = const at $x = 0$, i.e.

$$H(0,t) = H^{(0)}(t), \qquad \text{i.e. } H_\omega(0) = H^{(0)}_\omega, \tag{6.26}$$


so that the final solution of the problem, for each frequency component, may be presented as

$$H_\omega(x)\,e^{-i\omega t} = H^{(0)}_\omega \exp\left\{-\frac{x}{\delta_s}\right\}\exp\left\{-i\left(\omega t - \frac{x}{\delta_s}\right)\right\}, \tag{6.27a}$$

where constant $\delta_s$ is called the skin depth:

$$\delta_s \equiv -\frac{1}{\mathrm{Re}\,\kappa_-} = \left(\frac{2}{\mu\sigma\omega}\right)^{1/2}. \tag{6.27b}$$

10 Another important opportunity to exploit the linearity of Eq. (6.20) (as well as any linear, homogeneous
differential equation) is to use the spatial-temporal Green's function approach to explore the dependence of its
solutions on various initial conditions. Unfortunately, because of lack of time, I have to leave such exploration for
reader's exercise.

11 This common name is awkward, because Eq. (26) results from macroscopic Maxwell equations (16), but is
justified as the counterpart to the “macroscopic” boundary condition (30), to be discussed in a minute.


This solution describes the skin effect: the penetration of the ac magnetic field of frequency $\omega$
into a conductor only to a depth of the order of $\delta_s$. A couple of examples of the skin depth: for copper at
room temperature, $\delta_s \approx 1$ cm at the ac power distribution frequency of 60 Hz, and is of the order of just
1 μm at a few GHz, i.e. at typical frequencies of cell phone signals and kitchen microwave magnetrons.
For a modestly salted water, $\delta_s$ is close to 250 m at 1 Hz (with big implications for radio
communications with submarines), and is of the order of 1 cm at a few GHz (explaining the nonuniform
heating of a soup bowl in a microwave oven).
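These estimates may be reproduced with a few lines of code. The sketch below evaluates the skin depth $\delta_s = (2/\mu\sigma\omega)^{1/2}$ for the cases just mentioned. The conductivity values (about $6\times10^7$ S/m for copper and about 4 S/m for salted water), the choice $\mu = \mu_0$, and the sample frequency of 2.45 GHz are assumptions made for illustration; they are not taken from the text above.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, H/m


def skin_depth(sigma, freq_hz, mu=MU0):
    """Skin depth (2 / (mu*sigma*omega))**0.5, in meters."""
    omega = 2 * math.pi * freq_hz
    return math.sqrt(2.0 / (mu * sigma * omega))


# Assumed, order-of-magnitude conductivities (S/m).
SIGMA_COPPER = 5.96e7
SIGMA_SALT_WATER = 4.0

cases = [
    ("copper, 60 Hz", SIGMA_COPPER, 60.0),
    ("copper, 2.45 GHz", SIGMA_COPPER, 2.45e9),
    ("salted water, 1 Hz", SIGMA_SALT_WATER, 1.0),
    ("salted water, 2.45 GHz", SIGMA_SALT_WATER, 2.45e9),
]
for label, sigma, freq in cases:
    print(f"{label:>24s}: delta_s = {skin_depth(sigma, freq):.3g} m")
```

With these inputs the copper values come out slightly below 1 cm and slightly above 1 μm, and the 1 Hz value for water comes out close to 250 m, in line with the estimates quoted above.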


In order to complete the skin effect discussion, let us consider what happens with the induced ac
currents 12 and the electric field at this effect. When deriving our basic equation (18), we have used, in
particular, relations $\mathbf{j} = \nabla\times\mathbf{H} = \mu^{-1}\nabla\times\mathbf{B}$, and $\mathbf{E} = \mathbf{j}/\sigma$. Since a spatial differentiation of an exponent
yields a similar exponent, the electric field and current density have the same spatial dependence as the
magnetic field, i.e. penetrate inside the conductor by distances of the order of $\delta_s(\omega)$, but their vectors are
directed perpendicularly to $\mathbf{B}$, while still being parallel to the conductor surface: 13

$$\mathbf{j}_\omega(x) = \kappa_- H_\omega(x)\,\mathbf{n}_z, \qquad \mathbf{E}_\omega(x) = \frac{\kappa_-}{\sigma}H_\omega(x)\,\mathbf{n}_z. \tag{6.28}$$

By the way, integrating the first of these relations with the help of Eq. (26), we may find that
the linear density $\mathbf{J}$ of the surface currents (measured in A/m) is simply and fundamentally related to
the applied magnetic field:

$$\mathbf{J}_\omega \equiv \int_0^\infty \mathbf{j}_\omega(x)\,dx = -H^{(0)}_\omega\,\mathbf{n}_z. \tag{6.29}$$

Since this relation does not have frequency-dependent factors, we may sum it up for all frequencies and
get a universal relation

$$\mathbf{J}(t) = -H^{(0)}(t)\,\mathbf{n}_z = H^{(0)}(t)\,(\mathbf{n}_y\times\mathbf{n}_x) = \mathbf{H}^{(0)}(t)\times\mathbf{n}_x = \mathbf{n}\times\mathbf{H}^{(0)}(t), \tag{6.30}$$

where $\mathbf{n} = -\mathbf{n}_x$ is the outer normal to the surface - see Fig. 2b. This simple relation (whose last form is
independent of the reference frame choice) is not accidental. Indeed, Eq. (30) may be readily obtained
from the Ampere law (5.37) applied to a contour drawn around a fragment of the surface, but extending
under it much deeper than the skin depth - see contour $C_2$ in Fig. 2b - regardless of the exact law of the
field penetration. Relation (30) is frequently called the “macroscopic” boundary condition for the
magnetic field near conductor's surface, to distinguish it from the “microscopic” boundary condition
(26).

12 They are frequently called eddy currents, because of the loop form of their lines. (In the 1D geometry explored
above these loops are implicit, closing at infinity.)

13 Notice that vectors $\mathbf{j}$ and $\mathbf{E}$ are parallel, and have the same time dependence. This means that the time average
of the power dissipation $\mathbf{j}\cdot\mathbf{E}$ is finite. We will return to its discussion later in this chapter.
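The integration leading to the surface current density is elementary, but its sign is easy to get wrong, so a numerical check may be useful. The sketch below integrates $j_\omega(x) = \kappa_- H^{(0)}_\omega \exp\{\kappa_- x\}$ over the conductor's depth with the trapezoidal rule, for the decaying root $\kappa_- = -(1-i)/\delta_s$. The units ($\delta_s = 1$, $H^{(0)}_\omega = 1$), the integration depth of 40 skin depths, and the function name are assumptions made for illustration.

```python
import cmath


def surface_current(kappa_minus, h0, depth, steps=40000):
    """Trapezoidal estimate of the integral of kappa_- * h0 * exp(kappa_- * x) over [0, depth]."""
    dx = depth / steps
    total = 0j
    for n in range(steps + 1):
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * kappa_minus * h0 * cmath.exp(kappa_minus * n * dx)
    return total * dx


delta_s = 1.0                       # skin depth, used here as the unit of length
kappa_minus = -(1 - 1j) / delta_s   # decaying root, Re(kappa_-) < 0
h0 = 1.0                            # surface field amplitude, arbitrary units

j_sheet = surface_current(kappa_minus, h0, depth=40 * delta_s)
print("z-component of J_omega / H0 =", j_sheet)
```

The result is close to $-1$ and has a negligible imaginary part: the z-component of the net current per unit width equals $-H^{(0)}_\omega$, i.e. the net current depends neither on frequency nor on $\delta_s$, and is exactly what is needed to screen the field from the conductor's bulk.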

For the skin effect, the fundamental relation between the surface current density and the external
magnetic field means that the effect implementation does not require a dedicated ac magnetic field
source. For example, it takes place in any wire that carries ac current, and leads to current concentration
in a surface sheet of thickness ~$\delta_s$. (Of course, the quantitative analysis of this problem in a wire with an
arbitrary cross-section may be technically complicated, because it requires solving Eq. (18) for a 2D
geometry; even for the round cross-section, the solution involves the Bessel functions.) In this case, the
ac magnetic field outside the conductor, which still obeys Eq. (30), is better understood as the effect, rather
than the reason, of the ac current flow.


Finally, the reader should mind the validity limits of these results - besides the universal Eq.
(30). First, in order for the quasistatic approximation to be valid, frequency $\omega$ should not be too high, so
that the skin depth (27) remains much smaller than the corresponding wavelength,

$$\lambda \equiv \frac{2\pi v}{\omega} = \left(\frac{4\pi^2}{\varepsilon\mu\omega^2}\right)^{1/2}, \tag{6.31}$$

which decreases with $\omega$ faster than $\delta_s$ (27b). Note that the crossover frequency (at which $\delta_s$ becomes
comparable to $\lambda$),

$$\omega_r = \frac{\sigma}{\varepsilon} \equiv \frac{\sigma}{\varepsilon_r\varepsilon_0}, \tag{6.32}$$

is nothing else than the reciprocal charge relaxation time (4.10). As was discussed in Sec. 4.2, for good
metals this frequency is extremely high (about $10^{18}$ s$^{-1}$).
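To get a feeling for this limit, the following sketch evaluates the reciprocal relaxation time $\omega_r = \sigma/\varepsilon_r\varepsilon_0$ and the ratio of the skin depth to the wavelength, $\delta_s/\lambda$, which follows from $\delta_s = (2/\mu\sigma\omega)^{1/2}$ and $\lambda = 2\pi/\omega(\varepsilon\mu)^{1/2}$ and reduces to $(2\omega/\omega_r)^{1/2}/2\pi$. The copper-like conductivity and the choice $\varepsilon_r = 1$, $\mu = \mu_0$ are assumptions made only for this illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU0 = 4e-7 * math.pi     # vacuum permeability, H/m


def relaxation_frequency(sigma, eps_r=1.0):
    """Reciprocal charge relaxation time sigma / (eps_r * eps_0), in 1/s."""
    return sigma / (eps_r * EPS0)


def depth_to_wavelength(sigma, omega, eps_r=1.0, mu=MU0):
    """Ratio of the skin depth to the wavelength at angular frequency omega."""
    delta_s = math.sqrt(2.0 / (mu * sigma * omega))
    wavelength = 2 * math.pi / (omega * math.sqrt(eps_r * EPS0 * mu))
    return delta_s / wavelength


sigma_cu = 5.96e7  # S/m, assumed copper-like value
omega_r = relaxation_frequency(sigma_cu)

print(f"omega_r                   = {omega_r:.3g} 1/s")
print(f"delta_s/lambda at 1 GHz   = {depth_to_wavelength(sigma_cu, 2 * math.pi * 1e9):.3g}")
print(f"delta_s/lambda at omega_r = {depth_to_wavelength(sigma_cu, omega_r):.3g}")
```

At microwave frequencies the ratio is of the order of $10^{-5}$, so the quasistatic condition is satisfied with a huge margin; it becomes of the order of unity only at $\omega \sim \omega_r$.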

A more practical upper limit on $\omega$ is that the skin depth $\delta_s$ should stay much larger than the mean
free path $l$ of charge carriers. 14 Beyond this point, a non-local relation between vectors $\mathbf{j}(\mathbf{r})$ and $\mathbf{E}(\mathbf{r})$
becomes essential. Both theory and experiment show that at $\delta_s < l$, the skin effect still persists, but
acquires a slightly different frequency dependence, $\delta_s \propto \omega^{-1/3}$. Such anomalous skin effect has useful
applications, for example, for experimental measurements of the Fermi surface in metals. 15


6.3. Electrodynamics of superconductivity and gauge invariance

The effect of superconductivity 16 takes place when temperature $T$ is reduced below a certain
critical temperature ($T_c$), specific for each material. For most metallic superconductors, $T_c$ is typically
of the order of a few kelvins, though several exotic compounds (the so-called high-temperature
superconductors) with $T_c$ above 100 K have been found since 1987. The most notable property of
superconductors is the absence, at $T < T_c$, of measurable resistance to not very high dc currents.


14 A brief discussion of the mean free path may be found, for example, in SM Chapter 6. In very clean metals at
low temperatures, $\delta_s$ may approach $l$ at frequencies as low as ~1 GHz, though at room temperature the crossover
from the normal to the anomalous skin effect takes place at ~100 GHz.

15 See, e.g., A. Abrikosov, Introduction to the Theory of Normal Metals, Academic Press, 1972.

16 Discovered experimentally in 1911 by H. Kamerlingh Onnes.

However, electromagnetic properties of superconductors cannot be described by just taking $\sigma = \infty$
in our previous results. Indeed, for this case, Eq. (27b) would give $\delta_s = 0$, i.e., no ac magnetic field
penetration at all, while for the dc field we would have the uncertainty $\omega\sigma = ?$ Experiment shows
something substantially different: weak magnetic fields do penetrate into superconductors by a
material-specific London penetration depth $\delta_L \sim 10^{-7}$ to $10^{-6}$ m, 17 which is virtually frequency-independent until the
skin depth $\delta_s$, measured in the same material in its “normal” state, i.e. in the absence of superconductivity,
becomes less than $\delta_L$. (This crossover happens typically at frequencies ~$10^{13}$ s$^{-1}$.) The smallness of $\delta_L$
means that the magnetic field is pushed out of macroscopic samples at their transition into the
superconducting state.

This Meissner-Ochsenfeld effect, discovered experimentally in 1933, 18 may be partly understood
using the following classical reasoning. When we discussed the physics of conductivity in Sec. 4.2, we
implied that the current (and electric field) frequency $\omega$ is either zero or sufficiently low. In the classical
Drude reasoning (see Sec. 4.2), this is acceptable while $\omega\tau \ll 1$, where $\tau$ is the effective carrier
scattering time participating in Eqs. (4.12)-(4.13). If this condition is not satisfied, we should take into
account the charge carrier inertia; moreover, in the opposite limit $\omega\tau \gg 1$ we may neglect the scattering
altogether. Classically, we can describe the charge carriers in such a “perfect conductor” as particles that are
accelerated by the electric field in accordance with the 2nd Newton law (4.11) at all times,

$$\dot{\mathbf{v}} = \frac{\mathbf{F}}{m} = \frac{q}{m}\mathbf{E}, \tag{6.33}$$

so that the current density $\mathbf{j} = qn\mathbf{v}$ they create changes in time as

$$\dot{\mathbf{j}} = \frac{q^2 n}{m}\mathbf{E}. \tag{6.34}$$

In terms of the Fourier amplitudes (see the previous section), this means

$$-i\omega\,\mathbf{j}_\omega = \frac{q^2 n}{m}\mathbf{E}_\omega. \tag{6.35}$$

Comparing this formula with the relation $\mathbf{j}_\omega = \sigma\mathbf{E}_\omega$ implied in the last section, we see that we can use all
its results with the following replacement:

$$\sigma \to i\,\frac{q^2 n}{m\omega}. \tag{6.36}$$

This change replaces the characteristic equation (24) with

$$-i\omega = \frac{\kappa^2 m\omega}{i q^2 n\mu}, \tag{6.37}$$

i.e. replaces the skin effect with the field penetration by the following frequency-independent depth:


$$\delta \equiv -\frac{1}{\kappa_-} = \left(\frac{m}{\mu q^2 n}\right)^{1/2}. \tag{6.38}$$

Superficially, this means that the field decay into the superconductor does not depend on frequency:

$$H(x,t) = H(0,t)\exp\left\{-\frac{x}{\delta}\right\}, \tag{6.39}$$

explaining the Meissner-Ochsenfeld effect.

17 Named to acknowledge the pioneering theoretical work of brothers F. and H. London - see below.

18 It is hardly fair to shorten the name to just the “Meissner effect”, as it is frequently done, because of the
reportedly crucial contribution made by R. Ochsenfeld, then W. Meissner's student, to the discovery.
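The frequency independence of this depth may be checked directly: substituting the replacement $\sigma \to iq^2n/m\omega$ into the characteristic equation $-i\omega = \kappa^2/\sigma\mu$ gives a real $\kappa^2 = \mu q^2 n/m$ at any frequency. The sketch below does this numerically for several frequencies. The carrier parameters (electron charge and mass, $n = 10^{29}$ m$^{-3}$, $\mu = \mu_0$) and the function name are assumptions made for illustration.

```python
import cmath
import math

MU0 = 4e-7 * math.pi          # vacuum permeability, H/m
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg


def penetration_depth(omega, n, q=E_CHARGE, m=M_E, mu=MU0):
    """Decay length 1/kappa for a collision-free ("perfect") conductor."""
    sigma_eff = 1j * q**2 * n / (m * omega)            # effective "conductivity"
    kappa = cmath.sqrt(-1j * omega * sigma_eff * mu)   # from -i*omega = kappa**2/(sigma*mu)
    return 1.0 / kappa.real


# Assumed carrier density typical for a good metal, 1e29 per cubic meter.
depths = [penetration_depth(2 * math.pi * f, n=1e29) for f in (1e3, 1e6, 1e9, 1e12)]
for depth in depths:
    print(f"delta = {depth:.4g} m")
```

All four values coincide (up to rounding) with $(m/\mu q^2 n)^{1/2}$, which for the assumed parameters is close to $1.7\times10^{-8}$ m.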


However, there are two problems with this result. First, for the parameters typical for good
metals ($q = -e$, $n \sim 10^{29}$ m$^{-3}$, $m \sim m_e$, $\mu \approx \mu_0$), Eq. (38) gives $\delta \sim 10^{-8}$ m, a factor of 10 to $10^2$ lower than the
typical experimental values of $\delta_L$. Experiment also shows that the penetration depth diverges at $T \to T_c$,
which is not predicted by Eq. (38). Another, much more fundamental problem with Eq. (38) is that it has
been derived for $\omega\tau \gg 1$. Even if we assume that somehow there are no collisions at all, i.e. $\tau = \infty$, at
$\omega \to 0$ both parts of the characteristic equation (37) vanish, and we cannot make any conclusion about $\kappa$.
This is not just a mathematical artifact we could ignore. For example, let us place a non-magnetic metal
at $T > T_c$ into a static external magnetic field. The field will completely penetrate into the sample. Now
let us cool it. As soon as the temperature drops below $T_c$, our calculations become valid, forbidding the
penetration into the superconductor of any change of the field, so that the initial field would be “frozen”
inside the sample. The experiment shows something completely different: as $T$ is lowered below $T_c$, the
initial field is being pushed out of the sample.


The resolution of these contradictions has been provided by quantum mechanics. As was
explained in 1957 in a seminal work by J. Bardeen, L. Cooper, and J. Schrieffer (commonly referred to
as the BCS theory), superconductivity is due to the correlated motion of electron pairs, with opposite spins
and nearly opposite momenta. Such Cooper pairs, each with the electric charge $q = -2e$ and zero spin,
may form only in a narrow energy layer near the Fermi surface, of certain thickness $\Delta(T)$. Parameter
$\Delta(T)$, which may be also considered as the binding energy of the pair, tends to zero at $T \to T_c$, while at
$T \ll T_c$ it has a virtually constant value $\Delta(0) \approx 3.5\,k_{\rm B}T_c$, of the order of a few meV for most
superconductors. This fact readily explains the relatively low spatial density of the Cooper pairs:
$n_p(T) \sim n\Delta(T)/\varepsilon_{\rm F} \sim 10^{26}$ m$^{-3}$. With the correction $n \to n_p$, our Eq. (38) for the penetration depth becomes

$$\delta_L = \left(\frac{m}{\mu q^2 n_p(T)}\right)^{1/2}. \tag{6.40}$$

This expression diverges at $T \to T_c$, and generally fits the experimental data reasonably well, at least for
the so-called “clean” superconductors (with the mean free path $l = v_{\rm F}\tau$ much longer than the Cooper pair
size $\xi$ - see below).
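The effect of the correction $n \to n_p$ is easy to quantify. The sketch below evaluates the depth $(m/\mu q^2 n)^{1/2}$ first for single electrons with $n = 10^{29}$ m$^{-3}$, and then for Cooper pairs with $n_p = 10^{26}$ m$^{-3}$. Taking the pair's charge as $2e$ and its mass as $2m_e$, and using $\mu = \mu_0$, are assumptions of this illustration (only the charge and the order of magnitude of the densities are given above).

```python
import math

MU0 = 4e-7 * math.pi          # vacuum permeability, H/m
E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg


def london_depth(density, charge, mass, mu=MU0):
    """Penetration depth (m / (mu * q**2 * n))**0.5, in meters."""
    return math.sqrt(mass / (mu * charge**2 * density))


# Classical estimate: single electrons, n ~ 1e29 per cubic meter.
classical = london_depth(1e29, E_CHARGE, M_E)
# Cooper pairs: assumed charge 2e, assumed mass 2*m_e, n_p ~ 1e26 per cubic meter.
paired = london_depth(1e26, 2 * E_CHARGE, 2 * M_E)

print(f"classical estimate : {classical:.3g} m")
print(f"with Cooper pairs  : {paired:.3g} m")

# The depth grows without limit as the pair density drops (as it does at T -> T_c).
for n_p in (1e26, 1e24, 1e22):
    print(f"n_p = {n_p:.0e} 1/m^3 -> delta_L = {london_depth(n_p, 2 * E_CHARGE, 2 * M_E):.3g} m")
```

With these assumptions the depth increases from about $2\times10^{-8}$ m to about $4\times10^{-7}$ m, i.e. into the experimentally observed range, and it diverges as $n_p \to 0$.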


The smallness of the coupling energy $\Delta(T)$ is also a key factor in the explanation of the
Meissner-Ochsenfeld effect, as well as several macroscopic quantum phenomena in superconductors.
Because of Heisenberg's quantum uncertainty relation $\delta r\,\delta p \sim \hbar$, the Cooper-pair size (the so-called
coherence length) is relatively large: $\xi \sim \delta r \sim \hbar/\delta p \sim \hbar v_{\rm F}/\Delta(T) \sim 10^{-6}$ m. As a result, $n_p\xi^3 \gg 1$, meaning
that Cooper pairs are strongly overlapped in space. Now, due to their integer spin, Cooper pairs behave
like bosons, which means in particular that at low temperature they exhibit the so-called Bose-Einstein
condensation onto the same energy level. 19 This means that the frequency $\omega = E/\hbar$ of the time evolution
of each pair's wavefunction $\Psi = \psi\exp\{-i\omega t\}$ is the same, i.e. that the phases $\varphi$ of the wavefunctions,
defined by equation

$$\psi = |\psi|\,e^{i\varphi}, \tag{6.41}$$

become equal, so that the current is carried not by individual Cooper pairs but rather by their Bose-Einstein
condensate described by a single wavefunction. Due to this coherence, the quantum effects (which are,
in usual Fermi-liquids of single electrons, masked by the statistical spread of phases $\varphi$), become very
explicit - “macroscopic”.
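The strong spatial overlap of the pairs is readily confirmed by an order-of-magnitude estimate. The sketch below evaluates the coherence length $\xi \sim \hbar v_{\rm F}/\Delta$ and the number of pairs in a volume $\xi^3$. The Fermi velocity of $10^6$ m/s, the gap of 1 meV, and the pair density of $10^{26}$ m$^{-3}$ are assumed, typical values used only for illustration.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
EV = 1.602176634e-19     # one electron-volt, J


def coherence_length(v_fermi, gap_ev):
    """Estimate xi ~ hbar * v_F / Delta, in meters (gap given in eV)."""
    return HBAR * v_fermi / (gap_ev * EV)


# Assumed typical values: v_F ~ 1e6 m/s, Delta ~ 1 meV, n_p ~ 1e26 per cubic meter.
xi = coherence_length(v_fermi=1e6, gap_ev=1e-3)
pairs_per_volume = 1e26 * xi**3

print(f"xi          ~ {xi:.2g} m")
print(f"n_p * xi^3  ~ {pairs_per_volume:.2g}")
```

For these inputs $\xi$ is slightly below $10^{-6}$ m, and tens of millions of pairs share the volume occupied by each of them.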

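To illustrate this, let us write the well-known quantum-mechanical formula for the probability
current of a free, nonrelativistic particle, 20

$$\mathbf{j}_p = \frac{i\hbar}{2m}\left(\psi\nabla\psi^* - \text{c.c.}\right) \equiv \frac{1}{2m}\left[\psi^*(-i\hbar\nabla)\psi + \text{c.c.}\right]. \tag{6.42}$$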

Now let me borrow one result that will be proved later in the course (in Sec. 9.7) when we discuss the
analytical mechanics of a charged particle moving in an electromagnetic field. Namely, in order to
account for the magnetic field effects, particle's kinetic momentum $\mathbf{p}$, equal to $m\mathbf{v}$ (where $\mathbf{v} \equiv d\mathbf{r}/dt$ is
particle's velocity) has to be distinguished from its canonical momentum, 21

$$\mathbf{P} \equiv \mathbf{p} + q\mathbf{A}, \tag{6.43}$$

where $\mathbf{A}$ is the vector-potential of the field - see Eq. (5.27). In contrast with Cartesian components
$p_j = mv_j$ of momentum $\mathbf{p}$, the canonical momentum components are the generalized momenta corresponding
to components $r_j$ of the radius-vector $\mathbf{r}$, considered as generalized coordinates of the particle:
$P_j = \partial\mathcal{L}/\partial v_j$, where $\mathcal{L}$ is the particle's Lagrangian function. According to the general rules of transfer from
classical to quantum mechanics, 22 it is vector $\mathbf{P}$ whose operator (in the Schrödinger picture) equals $-i\hbar\nabla$,
so that the operator of kinetic momentum $\mathbf{p} = \mathbf{P} - q\mathbf{A}$ is equal to $-i\hbar\nabla - q\mathbf{A}$. Hence, in order to
account for the magnetic field effects, we should make the following replacement:

$$-i\hbar\nabla \to -i\hbar\nabla - q\mathbf{A}. \tag{6.44}$$

In particular, Eq. (42) has to be replaced with

$$\mathbf{j}_p = \frac{1}{2m}\left[\psi^*(-i\hbar\nabla - q\mathbf{A})\psi + \text{c.c.}\right]. \tag{6.45}$$

This expression becomes more transparent if we take the wavefunction in form (41):

$$\mathbf{j}_p = \frac{\hbar}{m}|\psi|^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{6.46}$$

This relation means, in particular, that in order to keep $\mathbf{j}_p$ invariant, the gauge transformation (8)-(9) has
to be accompanied by a simultaneous transformation of the wavefunction phase:

$$\varphi \to \varphi + \frac{q}{\hbar}\chi. \tag{6.47}$$

It is fascinating that the quantum-mechanical wavefunction (more exactly, its phase) is not
gauge-invariant - meaning that you may change it in your mind - at will! Again, this does not change any
observable (such as $\mathbf{j}$ or the probability density $\psi\psi^*$), i.e. any experimental results.

19 A qualitative discussion of the Bose-Einstein condensation of bosons may be found in SM Sec. 3.4, though the
full theory of superconductivity is more complex, because it describes the condensation taking place
simultaneously with the formation of effective bosons (Cooper pairs). For a more detailed coverage of physics of
superconductors, the reader may be referred, for example, to the already cited monograph by M. Tinkham,
Introduction to Superconductivity, 2nd ed., McGraw-Hill, 1996.

20 See, e.g., QM Sec. 1.4, in particular Eq. (1.47).

21 I am sorry to use traditional notations $\mathbf{p}$ and $\mathbf{P}$ for the momenta - the same symbols which were used for the
electric dipole moment and polarization in Chapter 3. I hope there will be no confusion, because the latter notions
are not used in this section.

22 See, e.g., CM Sec. 10.1, in particular Eq. (10.26).
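This gauge invariance may be verified numerically in one dimension. The sketch below computes the probability current as $(1/m)\,\mathrm{Re}[\psi^*(-i\hbar\,d/dx - qA)\psi]$, first for some wavefunction and vector-potential, and then after the joint transformation $A \to A + d\chi/dx$, $\psi \to \psi\exp\{iq\chi/\hbar\}$. The units $\hbar = m = q = 1$, the particular functions $\psi(x)$, $A(x)$ and $\chi(x)$, and all function names are arbitrary assumptions made for illustration.

```python
import cmath
import math

HBAR = M = Q = 1.0  # dimensionless units, assumed for illustration only


def psi(x):
    """Sample wavefunction |psi| * exp(i*phi), with phase phi = 0.7*x + 0.2*x**2."""
    return (1.0 + 0.3 * math.sin(x)) * cmath.exp(1j * (0.7 * x + 0.2 * x * x))


def vec_a(x):
    """Sample vector-potential component along x."""
    return 0.5 * math.cos(2.0 * x)


def chi(x):
    """Arbitrary gauge function."""
    return math.sin(3.0 * x) + 0.1 * x**3


def dchi(x):
    """Analytical derivative of the gauge function."""
    return 3.0 * math.cos(3.0 * x) + 0.3 * x**2


def current(wave, a, x, h=1e-5):
    """(1/m) Re[psi* (-i*hbar*d/dx - q*A) psi], with a central-difference derivative."""
    dwave = (wave(x + h) - wave(x - h)) / (2.0 * h)
    kinetic = -1j * HBAR * dwave - Q * a(x) * wave(x)
    return (wave(x).conjugate() * kinetic).real / M


def psi_gauged(x):
    return psi(x) * cmath.exp(1j * Q * chi(x) / HBAR)


def a_gauged(x):
    return vec_a(x) + dchi(x)


def closed_form(x):
    """(hbar/m) |psi|^2 (dphi/dx - q*A/hbar) for the sample phase above."""
    phase_gradient = 0.7 + 0.4 * x
    return HBAR / M * abs(psi(x))**2 * (phase_gradient - Q * vec_a(x) / HBAR)


for x in (-1.0, 0.3, 2.0):
    print(x, current(psi, vec_a, x), current(psi_gauged, a_gauged, x), closed_form(x))
```

In each printed row the three numbers coincide within the accuracy of the numerical differentiation: the current is not affected by the gauge transformation, and agrees with the phase-gradient form of the current.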


For the electric current density of the whole superconducting condensate, Eq. (46) yields

$$\mathbf{j} = \frac{\hbar q n_p}{m}\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{6.48}$$


This equation shows that this supercurrent may be induced by dc magnetic field alone and does not
require any electric field. Indeed, for the simplest, 1D geometry shown in Fig. 2a, $\mathbf{j}(\mathbf{r}) = j(x)\,\mathbf{n}_z$,
$\mathbf{A}(\mathbf{r}) = A(x)\,\mathbf{n}_z$, and $\partial/\partial z = 0$, so that the Coulomb gauge condition (5.48) is satisfied for any choice of the gauge
function $\chi(x)$, and for the sake of simplicity we can choose it to provide $\varphi(\mathbf{r}) \equiv$ const, 23 so that

$$\mathbf{j} = -\frac{q^2 n_p(T)}{m}\mathbf{A}. \tag{6.49}$$


This is the so-called London equation, proposed (in a different form) by brothers F. and H.
London in 1935 for a phenomenological description of the Meissner-Ochsenfeld effect. Combining it
with Eq. (5.47), generalized for an arbitrary uniform medium by the replacement $\mu_0 \to \mu$, we get

$$\nabla^2\mathbf{A} = \frac{\mu q^2 n_p(T)}{m}\mathbf{A}. \tag{6.50}$$

This simple differential equation, similar in structure to Eq. (18), has a similar exponential solution,

$$A(x) = A(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \qquad B(x) = B(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \qquad j(x) = j(0)\exp\left\{-\frac{x}{\delta_L}\right\}, \tag{6.51}$$


that shows that the magnetic field and supercurrent penetrate into a superconductor only by the
London's penetration depth $\delta_L$, given by Eq. (40), regardless of frequency. 24 By the way, integrating the
last result through the penetration layer, and using Eqs. (40), (49) and the vector-potential definition,
$\mathbf{B} = \nabla\times\mathbf{A}$ (for our geometry, giving $B(x) = -dA(x)/dx = A(x)/\delta_L$), we may check that the linear density $\mathbf{J}$ of
the surface supercurrent still satisfies the universal relation (30).

23 This is the so-called London gauge which, for our geometry, is also the Coulomb gauge.

24 Since not all electrons of a superconductor form Cooper pairs, at any frequency $\omega \neq 0$ they provide Joule losses
which are not described by Eq. (48). These losses become very substantial when frequency $\omega$ becomes so high
that the skin-effect length $\delta_s$ of the material (as measured with superconductivity suppressed, say by high
magnetic field) becomes less than $\delta_L$. For typical metallic superconductors, this happens at frequencies of a few
hundred GHz, so that even for microwaves, Eq. (51) gives a fairly good description of the field penetration.

Let me hope that the physical intuition of the reader enables him or her to make the following
semi-quantitative generalization of the quantitative solution (51) to a superconductor sample of arbitrary
shape: $\mathbf{B}$ and $\mathbf{j}$ may only penetrate into the sample by distances of the order of $\delta_L(T)$. In particular, for
samples much larger than $\delta_L$, the London theory gives the following “macroscopic” description of the
Meissner-Ochsenfeld effect: $\mathbf{j} = 0$ and $\mathbf{B} = 0$ everywhere inside a superconductor. In this coarse
description, the bulk superconductor sample behaves as an ideal diamagnet, with $\mu = 0$. 25 In particular,
we can use this analogy and the first of Eqs. (5.125) to immediately obtain the magnetic field
distribution outside a superconducting sphere:


$$\mathbf{B} = \mu_0\mathbf{H} = -\mu_0\nabla\phi_m, \qquad \phi_m = H_0\left(-r - \frac{R^3}{2r^2}\right)\cos\theta. \tag{6.52}$$


Figure 3 shows the corresponding surfaces of equal potential $\phi_m$. It is evident that the magnetic
field lines (normal to the equipotential surfaces) bend to become parallel to the superconductor's
surface. By the way, this pattern illustrates the answer to the question that might arise at making
assumption (19): what happens to superconductors in a normal magnetic field? The answer is: the field
is deformed outside the superconductor to provide $B_n = 0$ at the surface - otherwise, due to the continuity
of $B_n$, the magnetic field would penetrate the superconductor, which is impossible. Of course, this answer
should be taken with a grain of salt: strong magnetic fields do penetrate into superconductors, destroying
superconductivity (completely or partly), thus violating the Meissner-Ochsenfeld effect. Such a
penetration by itself features several interesting electrodynamic effects, for whose discussion we
unfortunately do not have time. 26
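The key property of this field distribution, $B_n = 0$ at the sphere's surface, may be checked numerically. The sketch below takes the scalar potential $\phi_m = H_0(-r - R^3/2r^2)\cos\theta$ and finds the spherical components of $\mathbf{H} = -\nabla\phi_m$ by central differences. The units $H_0 = R = 1$, the differentiation step, and the function names are assumptions made for illustration.

```python
import math


def phi_m(r, theta, h0=1.0, radius=1.0):
    """Scalar potential outside a sphere with mu = 0, in a uniform field h0."""
    return h0 * (-r - radius**3 / (2.0 * r**2)) * math.cos(theta)


def field(r, theta, h0=1.0, radius=1.0, step=1e-6):
    """Components (H_r, H_theta) of H = -grad(phi_m), by central differences."""
    h_r = -(phi_m(r + step, theta, h0, radius)
            - phi_m(r - step, theta, h0, radius)) / (2.0 * step)
    h_theta = -(phi_m(r, theta + step, h0, radius)
                - phi_m(r, theta - step, h0, radius)) / (2.0 * step * r)
    return h_r, h_theta


for theta_deg in (0, 30, 60, 90):
    theta = math.radians(theta_deg)
    h_r, h_theta = field(1.0, theta)
    print(f"theta = {theta_deg:2d} deg: H_r = {h_r:+.2e}, H_theta = {h_theta:+.4f}")

print("far from the sphere, on the field axis:", field(100.0, 0.0))
```

At the surface the normal component vanishes at all polar angles, while the tangential component reaches $1.5H_0$ in magnitude at the equator; far from the sphere the field returns to its unperturbed value $H_0$.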


Fig. 6.3. Surfaces of constant scalar potential $\phi_m$ of magnetic field around a superconducting sphere of
radius $R \gg \delta_L$, placed into a weak uniform, vertical magnetic field.


6.4. Electrodynamics of macroscopic quantum phenomena 

We have seen that for the ac magnetic field penetration, the quantum theory of superconductivity
gives essentially the same result as the classical theory of a perfect conductor - cf. Eqs. (39) and (51) -
with the “only” conceptual exception that the former theory extends the effect to dc fields. However, the
quantum theory of superconductors is much richer. For example, let us use Eq. (48) to derive the
fascinating effect of magnetic flux quantization. Consider a closed ring made of a superconducting
“wire” with the cross-section size much larger than $\delta_L$ (Fig. 4a).

25 Of course, this analogy sweeps under the rug the real physics of the Meissner-Ochsenfeld effect. In particular,
in superconductors the role of the surface “magnetization currents” with effective density $\mathbf{j}_{\rm ef} = \nabla\times\mathbf{M}$ (see Fig. 5.11
and its discussion) is played by the real, persistent surface supercurrents (48).

26 The interested reader may be referred, e.g., to Chapter 5 of M. Tinkham's monograph cited above.



Fig. 6.4. (a) Closed, flux-quantizing superconducting ring, (b) a ring cut with a narrow slit, 
and (c) a Superconducting QUantum Interference Device (SQUID). 

From the last section's analysis, we know that deep inside the wire the supercurrent is
exponentially small. Integrating Eq. (48) along any closed contour $C$ that does not approach the surface
closer than a few $\delta_L$ at any point, we get

$$\oint_C \nabla\varphi\cdot d\mathbf{r} - \frac{q}{\hbar}\oint_C \mathbf{A}\cdot d\mathbf{r} = 0. \tag{6.53}$$

The first integral, i.e. the difference of $\varphi$ in the initial and final points, has to be equal to either zero or
an integer multiple of $2\pi$, because the change $\varphi \to \varphi + 2\pi n$ does not change condensate's wavefunction:

$$\psi' \equiv |\psi|\,e^{i(\varphi + 2\pi n)} = |\psi|\,e^{i\varphi} \equiv \psi. \tag{6.54}$$


On the other hand, the second integral in Eq. (53) is just the magnetic flux $\Phi$ (1) through the contour -
and, due to the Meissner-Ochsenfeld effect, through the superconducting ring as a whole. As a result, we
get

$$\Phi = n\Phi_0, \qquad \Phi_0 \equiv \frac{2\pi\hbar}{|q|} = \frac{h}{|q|}, \qquad n = 0, \pm1, \pm2, \ldots, \tag{6.55}$$

i.e. the magnetic flux can only take values multiple of the flux quantum $\Phi_0$. This effect, predicted in
1950 by the same Fritz London (who expected $q$ to be equal to the electron charge $-e$), was confirmed
experimentally in 1961, 27 with $|q| = 2e$ (so that in superconductors $\Phi_0 = h/2e \approx 2.07\times10^{-15}$ Wb).
Historically, this observation gave a decisive support to the BCS theory of the Cooper pairs as the basis
of superconductivity, which had been put forward just 4 years before. 28
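The numerical value of the flux quantum follows directly from the fundamental constants. The sketch below evaluates $2\pi\hbar/|q|$ for $|q| = 2e$, and also for the value $|q| = e$ initially expected by London; the function name is an assumption of this illustration, while the constants are the exact SI values.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J*s (exact SI value)
E_CHARGE = 1.602176634e-19   # elementary charge, C (exact SI value)


def flux_quantum(charge):
    """Flux quantum 2*pi*hbar/|q|, in webers."""
    hbar = H_PLANCK / (2 * math.pi)
    return 2 * math.pi * hbar / abs(charge)


phi0_pairs = flux_quantum(-2 * E_CHARGE)   # Cooper pairs, q = -2e
phi0_single = flux_quantum(-E_CHARGE)      # single electrons, q = -e

print(f"Phi_0 for |q| = 2e: {phi0_pairs:.4e} Wb")
print(f"Phi_0 for |q| = e : {phi0_single:.4e} Wb")
```

The two candidate values differ by a factor of two, so that a measurement of the flux step unambiguously discriminates between the single-electron and the pair pictures.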


27 Independently and virtually simultaneously by two groups: B. Deaver and W. Fairbank, and R. Doll and M. 
Nabauer, so that their reports were published back-to-back in Phys. Rev. Lett. 

28 Actually, the ring is not entirely necessary. In 1957, A. Abricosov used the Ginzburg-Landau equations (see 
below) to explain the counter-intuitive behavior of the so-called type-II superconductors, known experimentally 
as the Shubnikov phase since the 1930s. He showed that high magnetic field may penetrate into such 
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flux 

quantization 
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The flux quantization is just one of the so-called macroscopic quantum effects in 
superconductivity. Consider, for example, a superconducting ring interrupted with a very narrow slit 
(Fig. 4b). Integrating Eq. (48) along the current-free path from point 1 to point 2, along the dashed line 
in Fig. 4 (again, deeper than 5fT) from the surface), we get 




• dr 


<Pi ~<P\ -f-' $• 
n 


Using the flux quantum definition (55), this result may be rewritten as 

Josephson 
phase 
difference 


27t _ 

(P = (Px~(Pi =— ® 

^ n 


(6.56) 


(6.57) 


where cp is called the Josephson phase difference. In contrast to each of the phases cp\^, their difference 
cp is gauge-invariant, because it is directly related to the gauge-invariant magnetic flux. 

Can this tp be measured? Yes, using the Josephson effect. 19 In order to understand his prediction, 
let us take two (for the argument simplicity, similar) superconductors, connected with some sort of weak 
link, for example a tunnel barrier or a short normal-metal bridge, through that a small Cooper pair 
current can flow. (Such system of two coupled superconductors is now called a Josephson junction .) Let 
us think what this supercurrent I may be a function of. For that, the reverse thinking is helpful: let us 
imagine we can change current from outside; what parameter of the superconducting condensate can it 
affect? 


If the current is weak, it cannot perturb the superconducting condensate density, proportional to 
\f\ ; hence it may only change the Cooper condensate phases (p\i. However, according to Eq. (41), the 
phases are not gauge-invariant, while the current should be, hence I may affect - or should be a function 
of - the phase difference tp defined by Eq. (57). Moreover, just has already been argued during the flux 
quantization discussion, a change of any of (p\p (and hence of cp) by 2^ or any of its multiples should not 
change the current. In addition, if the wavefunction is the same in both superconductors {cp = 0), 
supercurrent should vanish due to the system symmetry. Hence function /( cp) should satisfy conditions 


7(0) =0, I(<p + 2 7f) = I((p). 


(6.58) 


Josephson 

supercurrent 


With this understanding, we should not be terribly surprised by the following Josephson’ s result that for 
the weak link provided by weak tunneling, 30 


I(<P) = I C sin cp, 


(6.59) 


superconductors, whose coherence length $\xi$ is smaller than the London's penetration depth $\delta_L$), in the form of
self-formed tubes surrounded by vortex-shaped supercurrents - the so-called Abrikosov vortices, with the
superconductivity suppressed near the middle of each tube. This suppression makes each flux tube topologically
equivalent to a superconducting ring, with the magnetic flux through it equal to one flux quantum, and its ends
being magnetically similar to magnetic monopoles - see Sec. 5.6 above.

29 It was predicted in 1962 by B. Josephson (then a PhD student!), and observed experimentally by several groups
soon after that.

30 For some other types of weak links, the function $I(\varphi)$ may deviate from the sine form (59) rather
considerably, while still satisfying the general requirements (58).

where the constant $I_c$, which depends on the strength of the weak link and on temperature, is called the
critical current.


Let me show how such an expression may be derived, for a narrow and short weak link made of a
normal metal or a superconductor. 31 The microscopic theory of superconductivity shows that, within certain
limits, the Bose-Einstein condensate of Cooper pairs may be described by the following nonlinear
Schrödinger equation 32

$$\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2\psi + U(\mathbf{r})\psi = \varepsilon\psi + \psi \times \left(\text{a nonlinear function of } |\psi|^2\right). \tag{6.60}$$


The first three terms of this equation are similar to those of the usual Schrödinger equation (which
conserves the number of particles), while the nonlinear function in the last term describes the formation
and dissolution of Cooper pairs, and in particular gives the equilibrium value of $n_s$ as a function of
temperature. Now let the weak link size scale $a$ be much smaller than both the Cooper pair size $\xi$ and the
London's penetration depth $\delta_L$. The first of these relations ($a \ll \xi$) makes the first term in Eq. (60), which
scales as $1/a^2$, much larger than all other terms, while the latter relation ($a \ll \delta_L$) allows one to neglect
magnetic field effects, and hence drop the term $(-q\mathbf{A})$ from the parentheses in Eq. (60), reducing it to just
our familiar Laplace equation for the wavefunction:

$$\nabla^2\psi = 0. \tag{6.61}$$


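Since the weak coupling cannot change $|\psi|$ in the bulk superconducting electrodes, Eq. (61) may be solved
with the following simple boundary conditions:

$$\psi(\mathbf{r}) \to \begin{cases} |\psi| e^{i\varphi_1}, & \text{for } \mathbf{r} \to \mathbf{r}_1, \\ |\psi| e^{i\varphi_2}, & \text{for } \mathbf{r} \to \mathbf{r}_2, \end{cases} \tag{6.62}$$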


where $\mathbf{r}_1$ and $\mathbf{r}_2$ are some points well inside the corresponding superconductors, i.e. at distances much
larger than $a$ from the weak link center. It is straightforward to verify that the solution of this boundary
problem for the complex function $\psi$ may be expressed as follows,

$$\psi(\mathbf{r}) = |\psi| e^{i\varphi_1} f(\mathbf{r}) + |\psi| e^{i\varphi_2}\left(1 - f(\mathbf{r})\right), \tag{6.63}$$

via the real function $f(\mathbf{r})$ that satisfies the Laplace equation and the following boundary conditions:

$$f(\mathbf{r}) \to \begin{cases} 1, & \text{for } \mathbf{r} \to \mathbf{r}_1, \\ 0, & \text{for } \mathbf{r} \to \mathbf{r}_2. \end{cases} \tag{6.64}$$


The function $f(\mathbf{r})$ depends on the weak link geometry and may be rather complicated, but we do not


31 This derivation belongs to L. Aslamazov and A. Larkin, JETP Lett. 9, 87 (1969). If the reader is not interested 
in this topic, he or she may safely skip it, jumping directly to the text following Eq. (68). 

32 At $T \to T_c$, where $n_s \to 0$, the Taylor expansion of the nonlinear function in Eq. (60) may be limited to just
one term proportional to $|\psi|^2 \propto n_s$. In this limit, Eq. (60) is called the Ginzburg-Landau equation - see SM (4.58).
Derived by V. Ginzburg and L. Landau in 1950 from phenomenological arguments (see, e.g., SM Sec. 4.3), i.e.
before the advent of the BCS theory, this simple equation, solved together with Eq. (48) and the Maxwell
equations, may describe a very broad range of macroscopic quantum effects including the Abrikosov vortices,
critical fields and currents, etc. - see, e.g., M. Tinkham's monograph cited above.

Macroscopic quantum interference


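need to know it to get the most important result. Indeed, plugging this solution into Eq. (48) (with the term
$-q\mathbf{A}$ ignored as being negligibly small), we get

$$\mathbf{j}_p = \frac{\hbar}{m}|\psi|^2\,\nabla f\,\sin\varphi, \quad \text{so that} \quad \mathbf{j} = \frac{\hbar q n_p}{m}\,\nabla f\,\sin\varphi. \tag{6.65}$$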
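For the reader who wants to see why the function $f$ itself drops out of Eq. (65), here is the intermediate algebra, written as a short LaTeX sketch. It assumes (these are my assumptions, not spelled out in the text above) that the phase difference is $\varphi \equiv \varphi_1 - \varphi_2$, and that with the vector potential neglected the current is proportional to $\mathrm{Im}(\psi^*\nabla\psi)$.

```latex
% Assumptions: \varphi \equiv \varphi_1 - \varphi_2; the current is proportional
% to Im(\psi^* \nabla\psi); \psi is taken from Eq. (6.63) with real f and constant |\psi|.
\begin{align*}
\nabla\psi &= |\psi|\left(e^{i\varphi_1} - e^{i\varphi_2}\right)\nabla f, \\
\psi^*\nabla\psi &= |\psi|^2\left[f\left(1 - e^{-i\varphi}\right)
                    + (1 - f)\left(e^{i\varphi} - 1\right)\right]\nabla f, \\
\operatorname{Im}\left(\psi^*\nabla\psi\right)
           &= |\psi|^2\left[f\sin\varphi + (1 - f)\sin\varphi\right]\nabla f
            = |\psi|^2\,\nabla f\,\sin\varphi .
\end{align*}
```

The two contributions proportional to $f$ and $(1 - f)$ add up to a factor independent of $f$, so that the weak link geometry enters the result only via the gradient $\nabla f$.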


Integrating this relation over any cross-section $S$ of the weak link, we arrive at Josephson's result (59),
with the following critical current:

$$I_c = \frac{\hbar q n_p(T)}{m}\int_S (\nabla f)_n\, d^2r. \tag{6.66}$$


This expression may be readily evaluated via the resistance of the same weak link in the
“normal” (non-superconducting) state, say at $T > T_c$. Indeed, as we know from Sec. 4.3, the distribution
of the electrostatic potential $\phi$ at normal conduction also obeys the Laplace equation, with boundary
conditions that may be taken in the form

$$\phi(\mathbf{r}) \to \begin{cases} V, & \text{for } \mathbf{r} \to \mathbf{r}_1, \\ 0, & \text{for } \mathbf{r} \to \mathbf{r}_2. \end{cases} \tag{6.67}$$


Comparing the boundary problem for $\phi(\mathbf{r})$ with that for the function $f(\mathbf{r})$, we get $\phi = Vf$. This means that the
gradient $\nabla f$, which participates in Eq. (66), is just $(-\mathbf{E}/V) = (-\mathbf{j}/\sigma V)$. Hence the integral in that formula is
just $-I/\sigma V = -1/\sigma R_n$, where $R_n$ is the resistance of the Josephson junction in its normal state. As a result,
Eq. (66) yields (with the minus sign absorbed by the negative charge of the Cooper pairs)

$$I_c = \frac{\hbar |q| n_p(T)}{m}\,\frac{1}{\sigma R_n}, \tag{6.68}$$

showing that the $I_c R_n$ product does not depend on the junction geometry, though it does depend on
temperature, vanishing, together with $n_p(T)$, at $T \to T_c$. (Well below the critical temperature, the $I_c R_n$
product of sufficiently short weak links is of the order of $\Delta(0)/e$, i.e. of the order of a few mV.)


Now let us see what happens if a Josephson junction is placed into the gap in a superconductor
ring - see Fig. 4c. In this case, we can combine Eqs. (57) and (59), getting

$$I = I_c \sin\left(2\pi\frac{\Phi}{\Phi_0}\right). \tag{6.69}$$


This effect of the periodic dependence of the current on the flux is called macroscopic quantum
interference, 33 while the system shown in Fig. 4c is called a superconducting quantum interference device,
abbreviated as SQUID (with all letters capital, please :-). The low value of the magnetic flux quantum
$\Phi_0$, and hence the high sensitivity of $\varphi$ to the magnetic field, allows using SQUIDs as ultrasensitive
magnetometers. Indeed, for a superconducting ring of area ~1 cm$^2$, one period of the change of the
supercurrent (69) is produced by a magnetic field change of the order of $10^{-11}$ T ($10^{-7}$ Gs), while sensitive
electronics allows one to measure a tiny fraction of this period - limited by thermal noise at a level of the


33 The name is due to the deep analogy between this phenomenon and the interference between two waves, to be
discussed in detail in Sec. 8.4.

order of a few fT. This sensitivity allows measurements, for example, of the magnetic fields induced by
the beating human heart, and even by brain activity, outside of the body.
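The field scale quoted above is easy to check numerically. The short Python sketch below assumes the standard expression $\Phi_0 = h/2e$ for the flux quantum and uses the exact SI values of $h$ and $e$; the function names are mine, introduced only for this illustration.

```python
# Order-of-magnitude check of the SQUID field period (illustrative sketch).
H_PLANCK = 6.62607015e-34    # Planck constant h, J*s (exact SI value)
E_CHARGE = 1.602176634e-19   # elementary charge e, C (exact SI value)


def flux_quantum():
    """Magnetic flux quantum Phi_0 = h / 2e (in Wb) for Cooper pairs."""
    return H_PLANCK / (2.0 * E_CHARGE)


def field_period(area_m2):
    """Field change (in T) that changes the flux through area_m2 by one Phi_0."""
    return flux_quantum() / area_m2


phi0 = flux_quantum()
b_period = field_period(1e-4)   # ring area ~1 cm^2 = 1e-4 m^2
print(f"Phi_0 = {phi0:.3e} Wb, field period = {b_period:.1e} T")
```

For a ring of area 1 cm$^2$ this gives a period of about $2 \times 10^{-11}$ T, in agreement with the estimate in the text.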

An important aspect of quantum interference is the so-called Aharonov-Bohm (AB) effect. 34
Let the magnetic field lines be limited to the central part of the SQUID ring, so that no appreciable
magnetic field ever touches the superconducting ring material. (This may be done experimentally with
very good accuracy, for example using high-$\mu$ magnetic cores - see their discussion in Sec. 5.6.) As
predicted by Eq. (69), and confirmed by several careful experiments carried out in the mid-1960s, 35 this
restriction does not matter - the interference is observed anyway. This means that not only the magnetic
field $\mathbf{B}$, but also the vector potential $\mathbf{A}$ represents physical reality, albeit quite a peculiar one -
remember the gauge transformation?

Actually, the magnetic flux quantization (55) and the macroscopic quantum interference (69) are
not completely different effects, but just two manifestations of the whole group of inter-related
macroscopic quantum phenomena. In order to show that, one should note that if the critical current $I_c$ (or
rather its product by the loop's self-inductance $L$) is high enough, the flux $\Phi$ in the SQUID loop is due not
only to the external magnetic field flux $\Phi_{ext}$, but also has a self-field component - cf. Eq. (5.61): 36

$$\Phi = \Phi_{ext} - LI, \quad \text{where} \quad \Phi_{ext} = \int_S (\mathbf{B}_{ext})_n\, d^2r. \tag{6.70}$$

Now the relation between $\Phi$ and $\Phi_{ext}$ may be found by solving this equation together with Eq. (69).
Figure 5 shows this relation for several values of the dimensionless parameter $\lambda \equiv 2\pi L I_c/\Phi_0$.



Fig. 6.5. Function $\Phi(\Phi_{ext})$ for SQUIDs with various values of the normalized $LI_c$ product. Dashed
arrows show the flux leaps as the external field is changed. (The branches with $d\Phi/d\Phi_{ext} < 0$ are unstable.)


34 For a more detailed discussion of the AB effect, which also takes place for single quantum particles, see, e.g., 
QM Sec. 3.2. 

35 Later, similar experiments were carried out with electron beams, and then even with “normal” (meaning
non-superconducting) solid-state conducting rings. In this case, the effect is due to interference of the
wavefunction of a single charged particle (an electron) with itself, and is of course much harder to observe
than in SQUIDs. In particular, the ring size has to be very small, and the temperature low, to avoid “dephasing”
effects due to unavoidable interactions of the particles with the environment.

36 The sign before $LI$ would be positive, as in Eq. (5.61), if $I$ was the current flowing into the inductance.
However, in order to keep the sign in Eq. (69) intact, $I$ should mean the current flowing into the Josephson
junction, i.e. from the inductance, thus changing the sign of the term.

These plots show that if the critical current (or the inductance) is low, $\lambda \ll 1$, the self-field effects
are negligible, and the total flux follows the external field (i.e., $\Phi_{ext}$) quite faithfully. However, at $\lambda > 1$,
the dependence $\Phi(\Phi_{ext})$ becomes hysteretic, and at $\lambda \gg 1$ the positive-slope (stable) branches of this
function are nearly flat, with the total flux values corresponding to Eq. (55). Thus, a superconducting
ring closed by a high-$I_c$ Josephson junction exhibits nearly perfect flux quantization.
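The appearance of hysteresis at $\lambda > 1$ may be checked with a few lines of Python. The sketch below is only an illustration under my own conventions: it combines Eqs. (69) and (70) into $\Phi_{ext} = \Phi + LI_c\sin(2\pi\Phi/\Phi_0)$, measures both fluxes in units of $\Phi_0$, and tests whether $\Phi_{ext}$ is a monotonic function of $\Phi$ over one period. The function names and the sampling density are hypothetical choices.

```python
import math


def external_flux(x, lam):
    """Normalized external flux Phi_ext/Phi_0 for the total flux x = Phi/Phi_0.

    Follows from Eqs. (69)-(70): Phi_ext = Phi + L*I_c*sin(2*pi*Phi/Phi_0),
    with lam = 2*pi*L*I_c/Phi_0.
    """
    return x + lam / (2.0 * math.pi) * math.sin(2.0 * math.pi * x)


def is_hysteretic(lam, samples=2001):
    """True if Phi_ext(Phi) is non-monotonic, i.e. Phi(Phi_ext) is multivalued."""
    xs = [i / (samples - 1) for i in range(samples)]   # one period of x
    values = [external_flux(x, lam) for x in xs]
    return any(later < earlier for earlier, later in zip(values, values[1:]))


for lam in (0.3, 3.0):
    print(f"lambda = {lam}: hysteretic = {is_hysteretic(lam)}")
```

Since $d\Phi_{ext}/d\Phi = 1 + \lambda\cos(2\pi\Phi/\Phi_0)$, the function is monotonic for small $\lambda$, and has negative-slope (unstable) segments for larger $\lambda$, as the two sample values show.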

The self-field effects described by Eq. (70) create certain technical problems for SQUID
magnetometry, but they are the basis for one more application of these devices: ultrafast computing.
Indeed, Fig. 5 shows that at values of $\lambda$ modestly above 1 (e.g., $\lambda \approx 3$), and within a certain range of
the applied field, the SQUID has two stable flux states that differ by $\Delta\Phi \approx \Phi_0$ and may be used for coding
binary 0 and 1. For practical superconductors (like Nb), the time of switching between these states (see
dashed arrows in Fig. 5) is of the order of a picosecond, while the energy dissipated at such an event may
be as low as ~$10^{-19}$ J. (This bound is determined not by the device's physics, but by the fundamental
requirement for the energy barrier between the two states to be much higher than the thermal fluctuation
energy scale $k_BT$, ensuring a sufficiently long information retention time.) While the picosecond
switching speed may be also achieved with some semiconductor devices, the power consumption of the
SQUID-based digital devices may be 5 to 6 orders of magnitude lower, enabling VLSI circuits with
100-GHz-scale clock frequencies and manageable power dissipation. Unfortunately, the range of
practical application of these Rapid Single-Flux-Quantum (RSFQ) logic circuits is still narrow, due to
the inconvenience of their deep refrigeration to temperatures below $T_c$. 37
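To get a feeling for the "much higher than the thermal energy" requirement, one may compare the quoted switching energy with the thermal energy scale. The sketch below is only an estimate: the operating temperature of 4.2 K (liquid helium) is my assumption, not a number taken from the text, and the function names are hypothetical.

```python
# Ratio of the switching energy to the thermal energy scale (rough estimate).
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K (exact SI value)


def thermal_energy(temperature_k):
    """Thermal fluctuation energy scale k_B*T, in joules."""
    return K_BOLTZMANN * temperature_k


def energy_to_thermal_ratio(energy_j, temperature_k):
    """How many times energy_j exceeds k_B*T at the given temperature."""
    return energy_j / thermal_energy(temperature_k)


# Assumed values: 1e-19 J per switching event (from the text), T = 4.2 K (assumption).
ratio = energy_to_thermal_ratio(1e-19, 4.2)
print(f"E / k_B T = {ratio:.0f}")
```

With these assumed numbers, the switching energy exceeds $k_BT$ by about three orders of magnitude.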

Since we have already got the basic relations (57) and (59) describing the macroscopic quantum
phenomena in superconductivity, let me mention in brief two other members of this group, called the
Josephson effects. Differentiating Eq. (57) over time, and using the Faraday induction law (2), we get 38

Josephson phase-to-voltage relation

$$\frac{d\varphi}{dt} = \frac{2e}{\hbar}V. \tag{6.71}$$

This famous phase-to-voltage relation should be valid regardless of the way the voltage $V$ has been
created, 39 so let us apply Eqs. (59) and (71) to the simplest circuit with a non-superconducting source of
dc voltage - see Fig. 6.



Fig. 6.6. DC-voltage-biased Josephson junction. 


37 For more on that technology, see, e.g., the review paper by P. Bunyk et al., Int. J. High Speed Electron. Syst.
11, 257 (2001) and references therein.

38 Since the induced e.m.f. $V_{ind}$ cannot drop on the superconducting path between the Josephson junction
electrodes 1 and 2 (Fig. 3), it should be equal to $(-V)$, where $V$ is the voltage across the junction.

39 It may be also obtained from simple Schrödinger equation arguments - see, e.g., QM Sec. 2.2.

If the current $I$ is below the critical value,

$$-I_c < I < +I_c, \tag{6.72}$$

Eq. (59) allows the phase $\varphi$ to have a time-independent value

$$\varphi = \arcsin(I/I_c), \tag{6.73}$$


and hence, according to Eq. (71), a vanishing voltage drop across the junction: $V = 0$. This dc Josephson
effect is not quite surprising - indeed, we have postulated from the very beginning that the Josephson
junction may pass a certain supercurrent. Much more fascinating is the so-called ac Josephson effect that
takes place if the voltage across the junction has a nonvanishing average (dc) component $V_0$. For simplicity,
let us assume that this is the only voltage component: $V(t) = V_0 = \text{const}$; 40 then Eq. (71) may be readily
integrated to give $\varphi = \omega_J t + \varphi_0$, where

$$\omega_J \equiv \frac{2e}{\hbar}V_0. \tag{6.74}$$


This result, plugged into Eq. (59), shows that the supercurrent oscillates,

$$I(\varphi) = I_c \sin(\omega_J t + \varphi_0), \tag{6.75}$$

with the Josephson frequency $\omega_J$ (74), which is proportional to the applied dc voltage. For practicable
voltages, the frequency $f_J = \omega_J/2\pi$ corresponds to the GHz or even THz ranges, because the proportionality
coefficient in Eq. (74) is very high: $f_J/V_0 = 2e/h \approx 483$ MHz/$\mu$V. 41 An important experimental fact is the
universality of this coefficient. For example, in the mid-1980s, the group led by Prof. J. Lukens of our
department proved that this factor is material-independent with the relative accuracy of at least $10^{-15}$.
Very few experiments, especially in solid state physics, have ever reached such precision.
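The voltage-to-frequency coefficient quoted above follows directly from the fundamental constants. Here is a minimal Python check, using the exact SI values of $e$ and $h$; the function name and the sample voltages are mine, chosen only for illustration.

```python
# Josephson voltage-to-frequency conversion, f_J = (2e/h) * V_0 (illustrative sketch).
H_PLANCK = 6.62607015e-34    # Planck constant h, J*s (exact SI value)
E_CHARGE = 1.602176634e-19   # elementary charge e, C (exact SI value)


def josephson_frequency(v0_volts):
    """Josephson oscillation frequency (in Hz) at the dc voltage v0_volts."""
    return 2.0 * E_CHARGE / H_PLANCK * v0_volts


coefficient = josephson_frequency(1e-6) / 1e6   # MHz per microvolt
f_at_1mv = josephson_frequency(1e-3)            # Hz, for a hypothetical 1 mV bias
print(f"2e/h = {coefficient:.1f} MHz/uV, f_J(1 mV) = {f_at_1mv / 1e9:.1f} GHz")
```

So a dc voltage of just 1 mV corresponds to oscillations at almost 0.5 THz.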

This fundamental nature of the Josephson voltage-to-frequency relation (74) allows an important
application of the ac Josephson effect in metrology. Namely, by phase locking the Josephson oscillations
with an external microwave signal derived from an atomic frequency standard, one can get a more
precise dc voltage than from any other source. In NIST and other metrological institutions around the
globe, this effect is used for the calibration of simpler “secondary” voltage standards that can operate at
room temperature. 42


6.5. Inductors, transformers, and ac Kirchhoff laws 

Let a wire coil (meaning either a single loop illustrated in Fig. 5.4b, or a series of such loops,
such as one of the solenoids shown in Fig. 5.6) have a size $a$ that satisfies, at the frequencies of our interest,
the quasistatic limit condition $a \ll \lambda$. Moreover, let the coil's self-inductance $L$ be much larger than that
of the wires connecting it to other components of our system: ac voltage sources, voltmeters, etc. (Since,
according to Eqs. (5.75) and (5.113), $L$ scales as the number $N$ of wire turns squared, this is easier to achieve


40 In experiment, this condition is hard to implement, due to relatively high inductance of the current leads 
providing dc voltage supply. However, these complications do not change the main conclusion of the analysis. 

41 This 1962 prediction by B. Josephson was confirmed experimentally - implicitly (by phase locking of the
oscillations with an external oscillator) in 1963, and explicitly (by the detection of microwave radiation) in 1967.

42 For more on the Josephson effect and other macroscopic quantum phenomena in superconductivity, see, e.g.,
Chapters 6 and 7 in the monograph by M. Tinkham, which was cited above.

at $N \gg 1$.) Then in a system consisting of such lumped induction coils and external wires (and other
circuit elements such as resistors, capacitances, etc.), we may neglect the electromagnetic induction
effects everywhere outside the coil, so that the electric field in those external regions is potential. Then
the voltage $V$ between the coil's terminals may be defined (as in electrostatics) as the difference of the values
of the scalar potential $\phi$ between the terminals, i.e. as the integral

$$V = \int \mathbf{E}\cdot d\mathbf{r} \tag{6.76}$$


between the coil terminals along any path outside the coil. This voltage has to be balanced by the
induction e.m.f. (2) in the coil, so that if the Ohmic resistance of the coil is negligible, 43 we may write

$$V = \frac{d\Phi}{dt}, \tag{6.77}$$

where $\Phi$ is the magnetic flux in the coil. If the flux is due to the current $I$ in the same coil only (i.e. if it
is magnetically uncoupled from other coils), we may use Eq. (5.70) to get the well-known relation

Voltage drop on inductance coil

$$V = L\frac{dI}{dt}, \tag{6.78}$$

where the compliance with the Lenz sign rule is achieved by selecting the relations between the assumed
voltage polarity and current direction as shown in Fig. 7a.



Fig. 6.7. (a) Induction coil, (b) two inductively coupled coils, and (c) an ac transformer.


If similar conditions are satisfied for two magnetically coupled coils (Fig. 7b), then, in Eq. (77),
we need to use Eqs. (5.69) instead, getting

$$V_1 = L_1\frac{dI_1}{dt} + M\frac{dI_2}{dt}, \qquad V_2 = L_2\frac{dI_2}{dt} + M\frac{dI_1}{dt}, \tag{6.79}$$


where the repeating index is dropped for notation simplicity. Such systems of inductively coupled coils
have numerous applications in electrical engineering and physical experiment. 44 Probably the most
important is the ac transformer (Fig. 7c) where both coils share a common soft-ferromagnetic core. As
we already know, such a material (with $\mu \gg \mu_0$) tries to not let any magnetic field lines out, so that the
magnetic flux $\Phi(t)$ in the core is nearly the same in each of its cross-sections. This gives

$$V_1 \approx N_1\frac{d\Phi}{dt}, \qquad V_2 \approx N_2\frac{d\Phi}{dt}, \tag{6.80}$$


43 If the resistance is substantial, it may be represented, in calculations, by a separate lumped circuit element
(resistor) connected in series with the coil.

44 Starting from the pioneering experiments by M. Faraday - who invented this device.

where $N_{1,2}$ is the number of wire turns in each coil, so that the voltage ratio is completely determined by
the $N_1/N_2$ ratio.

Now we may generalize, to the ac current case, the notion of an electric circuit, already
discussed in Chapter 4 - see Fig. 4.3 reproduced in Fig. 8a below. Let not only the wire inductances but also
the wire capacitances be negligible in comparison with those of compact (lumped) capacitors. Then we
may represent the circuit as a connection of lumped circuit elements by ideal (voltage- and charge-free)
wires, with the list of circuit elements now including not only resistors and current sources (as in the
dc case), but also induction coils (including magnetically coupled ones) and capacitors - see Fig. 8b.

(b) $V = L\,\dfrac{dI}{dt}$, $\quad V = RI$, $\quad V = \dfrac{1}{C}\int I\,dt$, $\quad V = V(t)$

Fig. 6.8. (a) Typical ac system obeying ac Kirchhoff laws in the quasistatic approximation, and (b) the
simplest circuit elements.




In the quasistatic limit, the current through each wire is conserved, so that the “node rule”, i.e. the 1st
Kirchhoff law (4.7),

$$\sum_j I_j = 0, \tag{6.81}$$

remains valid. Also, if the electromagnetic induction effect is restricted to the interiors of lumped
induction coils as discussed above, the voltage drops $V_k$ across each circuit element may be still represented,
just as in dc circuits, as differences of potentials of the adjacent nodes, so that the “loop rule”, i.e. the 2nd
Kirchhoff law given by Eq. (4.8),

$$\sum_k V_k = 0, \tag{6.82}$$

is also valid.

In contrast to the dc case, Eqs. (81) and (82) are now (ordinary) differential equations.
However, if all circuit elements are linear (as in the examples presented in Fig. 8b), these equations may
be readily reduced to linear algebraic equations using the Fourier expansion. (In the most common case
of sinusoidal ac sources, the final stage of Fourier series summation is unnecessary.) I do not have time
to discuss even the simplest examples of such circuits, such as LC, LR, RC, and LRC loops and periodic
structures, 45 but my experience shows that the potential readers of these notes are well familiar with
these problems from their undergraduate studies. Let me only emphasize again that the standard ac


45 Interestingly, these effects include the wave propagation in periodic LC circuits, despite still staying within the
quasistatic approximation! However, within this approximation, the speed $1/(LC)^{1/2}$ of these waves is much lower
than the speed of electromagnetic waves in the surrounding medium - see the next chapter.

circuit theory is only valid within the quasistatic limit $a \ll \lambda$, and only under the condition of the
electric and magnetic field confinement inside lumped circuit elements.
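As a reminder of how the reduction of the ac Kirchhoff laws to algebraic equations works for a sinusoidal source, here is a minimal Python sketch for a single series LRC loop. It rests on several assumptions of mine: each variable is represented as $\mathrm{Re}[X_\omega e^{-i\omega t}]$ (so that $d/dt \to -i\omega$), the loop rule is written as "source amplitude equals the sum of the voltage drops", and the element values are hypothetical.

```python
import math


def series_lrc_current(v_amp, omega, resistance, inductance, capacitance):
    """Complex current amplitude in a series LRC loop driven by amplitude v_amp.

    Assumed convention: every variable is Re[X * exp(-i*omega*t)], so that
    d/dt -> -i*omega and the time integral -> 1/(-i*omega).
    """
    z_coil = -1j * omega * inductance            # from V = L dI/dt
    z_capacitor = 1j / (omega * capacitance)     # from V = (1/C) * integral of I dt
    z_total = resistance + z_coil + z_capacitor  # loop rule: the drops add up
    return v_amp / z_total


# Hypothetical element values, chosen only for illustration.
R, L, C = 10.0, 1e-3, 1e-6          # ohm, henry, farad
omega_0 = 1.0 / math.sqrt(L * C)    # resonance frequency of the loop
i_res = series_lrc_current(1.0, omega_0, R, L, C)
i_off = series_lrc_current(1.0, 2.0 * omega_0, R, L, C)
print(abs(i_res), abs(i_off))
```

At the resonance frequency the coil and capacitor contributions cancel, and the current amplitude is limited by the resistance alone; away from the resonance the amplitude is lower.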


6.6. Displacement currents 

The electromagnetic induction is not the only new effect arising in non-stationary
electrodynamics. Indeed, though Eqs. (16) are adequate for the description of quasistatic phenomena, a
deeper analysis shows that one of these equations, namely $\nabla\times\mathbf{H} = \mathbf{j}$, cannot be exact. To see that, let us
take the divergence of both sides of this equation:

$$\nabla\cdot(\nabla\times\mathbf{H}) = \nabla\cdot\mathbf{j}. \tag{6.83}$$

But, as the divergence of any curl, 46 the left-hand part should equal zero. Hence we get

$$\nabla\cdot\mathbf{j} = 0. \tag{6.84}$$

This is fine in statics, but in dynamics this equation forbids any charge accumulation, because according
to the continuity relation (4.5),

$$\nabla\cdot\mathbf{j} = -\frac{\partial\rho}{\partial t}. \tag{6.85}$$


This discrepancy was recognized by James Clerk Maxwell who suggested, in 1864, a way
out of this contradiction. If we generalize the equation for $\nabla\times\mathbf{H}$ by adding to the term $\mathbf{j}$ (that describes
real currents) the so-called displacement current term,

Displacement current density

$$\mathbf{j}_d \equiv \frac{\partial\mathbf{D}}{\partial t}, \tag{6.86}$$

(that of course vanishes in statics), then the equation takes the form

$$\nabla\times\mathbf{H} = \mathbf{j} + \mathbf{j}_d \equiv \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}. \tag{6.87}$$

In this case, due to the equation $\nabla\cdot\mathbf{D} = \rho$, the divergence of the right-hand part equals zero due to the
continuity equation, and the discrepancy is removed.
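The last statement may be spelled out in one line. The LaTeX sketch below uses only the relations already quoted: the equation $\nabla\cdot\mathbf{D} = \rho$ and the continuity relation (85); the interchange of the time and space derivatives is the only additional (and rather innocent) assumption.

```latex
% Divergence of the right-hand part of the generalized equation for curl H.
\begin{align*}
\nabla\cdot\left(\mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}\right)
  &= \nabla\cdot\mathbf{j} + \frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{D}\right) \\
  &= \nabla\cdot\mathbf{j} + \frac{\partial\rho}{\partial t} = 0 .
\end{align*}
```

So the divergence of both parts of the generalized equation vanishes identically, without any restriction on charge accumulation.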

This conclusion, and Eq. (87), are so important that it is worthwhile to have one more look
at their derivation using a particular “electrical engineering” model shown in Fig. 9. 47 Neglecting the
fringe field effects, we may use Eq. (4.1) to describe the relation between the current $I$ flowing through a
wire and the electric charge $Q$ of the capacitor: 48


$$\frac{dQ}{dt} = I. \tag{6.88}$$

46 Again, see MA Eq. (11.2) if you need.

47 No physicist should be ashamed of doing this. J. C. Maxwell himself arrived at his equations with a heavy
use of mechanical engineering arguments. (His main book, A Treatise on Electricity and Magnetism, is full of
drawings of gears and levers.) More generally, the whole history of science teaches us that snobbishness toward
engineering and other “not-a-real-physics” disciplines is a sure way toward producing nothing of either practical
value or fundamental importance. In real science, any method leading to novel, correct results should be welcome.

48 This is of course just the integral form of the continuity equation (85).


Now let us consider a closed contour $C$ drawn around the wire. (Solid points in Fig. 9 show the places
where the contour intercepts the plane of drawing.) This contour may be seen as either the line limiting
the surface $S_1$ (crossed by the wire) or the line limiting the surface $S_2$ (avoiding such crossing by passing
through the capacitor's gap). Applying the macroscopic Ampere law (5.117) to the former surface, we get

$$\oint_C \mathbf{H}\cdot d\mathbf{r} = \int_{S_1} j_n\, d^2r = I, \tag{6.89}$$

while for the latter surface the same law gives a different result,

$$\oint_C \mathbf{H}\cdot d\mathbf{r} = \int_{S_2} j_n\, d^2r = 0, \quad \text{[WRONG!]} \tag{6.90}$$

for the same integral. This is just an integral-form manifestation of the discrepancy outlined above, but it
shows clearly how serious the problem is (or rather it was - before Maxwell).





Now let us see how the introduction of the displacement currents saves the day, considering for
the sake of simplicity a plane capacitor of area $A$, with a constant electrode spacing. In this case, as we
already know, the field inside it is uniform, with $D = \sigma$, so that the total capacitor's charge $Q = A\sigma =
AD$, and the current (88) may be represented as

$$I = \frac{dQ}{dt} = A\frac{dD}{dt}. \tag{6.91}$$

So, instead of Eq. (90), the modified Ampere law gives

$$\oint_C \mathbf{H}\cdot d\mathbf{r} = \int_{S_2} (j_d)_n\, d^2r = \int_{S_2} \frac{\partial D_n}{\partial t}\, d^2r = \frac{dD}{dt}A = I, \tag{6.92}$$


i.e. the Ampere integral becomes independent of the choice of the (imaginary!) surface limited by
the contour $C$ - as it should.


6.7. Finally, the full Maxwell equation system


Macroscopic 

Maxwell 

equations 


This is a very special moment in the course: with the introduction of the displacement current, we have
finally arrived at the full set of macroscopic Maxwell equations for time-dependent fields, 49

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0, \qquad \nabla\times\mathbf{H} - \frac{\partial\mathbf{D}}{\partial t} = \mathbf{j}, \tag{6.93a}$$

$$\nabla\cdot\mathbf{D} = \rho, \qquad \nabla\cdot\mathbf{B} = 0, \tag{6.93b}$$

whose validity has been confirmed by an enormous body of experimental data. 50 The most striking
feature of these equations is that, even in the absence of (local) charges and currents, when all the
equations become homogeneous,


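$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t}, \tag{6.94a}$$

$$\nabla\cdot\mathbf{D} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \tag{6.94b}$$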


they still describe something very non-trivial: electromagnetic waves, including light. 51 Indeed, one can
interpret Eqs. (94a) in the following way: the change of the magnetic field creates, via the Faraday induction
effect, a vortex (divergence-free) electric field, while the dynamics of the electric field, in turn, creates a
vortex magnetic field via Maxwell's displacement currents.

We will carry out a detailed quantitative analysis of the waves in the next chapter, but it is easy
(and very instructive) to use the Maxwell equations to estimate their velocity $v$ and the field amplitude
ratio $E/H$ in a medium with $\mathbf{D} = \varepsilon\mathbf{E}$, $\mathbf{B} = \mu\mathbf{H}$, and $\mathbf{j} = 0$. Indeed, let the solution of these equations, in a
uniform, linear medium, have a time period $T$, and hence the wavelength $\lambda = vT$. Then the magnitude of
the left-hand part of the first of Eqs. (94a) is of the order of $E/\lambda \sim E/vT$, while that of its right-hand part
may be estimated as $B/T = \mu H/T$. Using similar estimates for the second of Eqs. (94a), we arrive at the
following two requirements for the $E/H$ ratio: 52

$$\frac{E}{H} \sim \mu v, \qquad \frac{E}{H} \sim \frac{1}{\varepsilon v}. \tag{6.95}$$

In order to ensure the compatibility of these two relations, the wave speed should satisfy the estimate

$$v \sim \frac{1}{(\varepsilon\mu)^{1/2}}, \tag{6.96}$$

reduced to $v \sim 1/(\varepsilon_0\mu_0)^{1/2} \equiv c$ in free space, while the ratio of the electric and magnetic field amplitudes
should be of the following order:

$$\frac{E}{H} \sim \mu v \sim \mu\,\frac{1}{(\varepsilon\mu)^{1/2}} = \left(\frac{\mu}{\varepsilon}\right)^{1/2}. \tag{6.97}$$

49 This vector form of the equations, magnificent in its symmetry and simplicity, was developed in 1884-85 by O.
Heaviside, with substantial contributions by H. Lorentz. (The original Maxwell's result, circa 1861, looked like a
system of 20 equations for Cartesian components of the vector and scalar potentials.)

50 Despite numerous efforts, no other corrections (e.g., additional terms) to the Maxwell equations have ever been
found, and these equations are still considered exact within the range of their validity, i.e. while the electric and
magnetic fields may be considered classically. Moreover, even in the quantum case, these equations are believed to
be strictly valid as relations between the Heisenberg operators of the electric and magnetic field.

51 Let me emphasize that this is only possible due to the “displacement current” term $\partial\mathbf{D}/\partial t$.

52 The fact that $T$ cancels shows (or rather hints) that these estimates are valid for waves of arbitrary frequency.


In the next chapter we will see that these are indeed the exact results for a plane electromagnetic wave. 
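For free space, the estimates (96) and (97) are easy to evaluate numerically. The sketch below uses the CODATA 2018 values of the electric and magnetic constants; the function names are hypothetical, introduced only for this illustration.

```python
import math

EPS0 = 8.8541878128e-12   # electric constant, F/m (CODATA 2018)
MU0 = 1.25663706212e-6    # magnetic constant, H/m (CODATA 2018)


def wave_speed(eps, mu):
    """Speed estimate v = 1/sqrt(eps*mu), in m/s, per Eq. (6.96)."""
    return 1.0 / math.sqrt(eps * mu)


def field_ratio(eps, mu):
    """Amplitude ratio E/H = sqrt(mu/eps), in ohms, per Eq. (6.97)."""
    return math.sqrt(mu / eps)


c = wave_speed(EPS0, MU0)
z0 = field_ratio(EPS0, MU0)
print(f"c = {c:.4e} m/s, E/H = {z0:.1f} ohm")
```

The first number reproduces the speed of light, while the second one, about 377 ohms, is the ratio $E/H$ for a wave in free space.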

Now let me fulfill the promise given in Sec. 2 and establish the validity limits of the quasistatic
approximation (16). For that, let the spatial scale of our system be $a$, generally unrelated to the wavelength
$\lambda = vT$, and let the system carry real currents $\mathbf{j}$ producing a certain magnetic field $H$. Then, according to
Eqs. (94a), this magnetic field Faraday-induces the electric field $E \sim \mu Ha/T$, whose displacement currents,
in turn, produce an additional magnetic field with magnitude

$$H' \sim \frac{a\varepsilon}{T}E \sim \frac{a\varepsilon}{T}\,\frac{a\mu}{T}H = \left(\frac{a}{vT}\right)^2 H = \left(\frac{a}{\lambda}\right)^2 H. \tag{6.98}$$

Hence, at $a \ll \lambda$, the displacement current effect is indeed negligible.
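To see how small (or how large) this correction is in practice, one may evaluate the factor $(a/\lambda)^2$ for a couple of systems. The Python sketch below is only an illustration: the two examples (a 1-m-scale circuit at 60 Hz, and a 10-cm-scale circuit at 1 GHz) are hypothetical, and the wave speed is assumed to be that in free space.

```python
C_LIGHT = 2.99792458e8   # speed of light in free space, m/s


def displacement_correction(size_m, frequency_hz, speed=C_LIGHT):
    """Relative field correction (a/lambda)^2 from Eq. (6.98), with lambda = v/f."""
    wavelength = speed / frequency_hz
    return (size_m / wavelength) ** 2


power_line = displacement_correction(1.0, 60.0)   # hypothetical 1-m circuit at 60 Hz
microwave = displacement_correction(0.1, 1e9)     # hypothetical 10-cm circuit at 1 GHz
print(f"60 Hz, 1 m: {power_line:.1e};  1 GHz, 10 cm: {microwave:.2f}")
```

In the first case the correction is of the order of $10^{-14}$, so that the quasistatic approximation is excellent, while in the second case it is already about 10%.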

Before going after the analysis of the full Maxwell equations in particular situations (that will be
the main goal of all the next chapters of this course), let us have a look at the energy balance they yield
for a certain volume $V$ - that may include both charged particles and the electromagnetic field. Since,
according to Eq. (5.10), the magnetic field does no work on charged particles even if they move, the
total power $\mathcal{P}$ being transferred from the field to the particles inside the volume is due to the electric
field alone:

$$\mathcal{P} = \int_V p\, d^3r, \qquad p = \mathbf{j}\cdot\mathbf{E}, \tag{6.99}$$

where I have used Eq. (4.38). Expressing $\mathbf{j}$ from the corresponding Maxwell equation of the system (93),
and plugging it into Eq. (99), we get

$$\mathcal{P} = \int_V \left[\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\right] d^3r. \tag{6.100}$$

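Let us pause here for a second, and transform the divergence of the vector $\mathbf{E}\times\mathbf{H}$ using the
well-known vector algebra identity: 53

$$\nabla\cdot(\mathbf{E}\times\mathbf{H}) = \mathbf{H}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{H}). \tag{6.101}$$

The last term in the right-hand part of this equation is, up to the sign, the first term in the square brackets
of Eq. (100), so that we can rewrite that formula as

$$\mathcal{P} = \int_V \left[-\nabla\cdot(\mathbf{E}\times\mathbf{H}) + \mathbf{H}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\right] d^3r. \tag{6.102}$$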


However, according to the Maxwell equation for $\nabla\times\mathbf{E}$, this curl is equal to $-\partial\mathbf{B}/\partial t$, so that the second term in
the square brackets of Eq. (102) equals $-\mathbf{H}\cdot\partial\mathbf{B}/\partial t$ and, according to Eq. (5.128), is just the (minus) time
derivative of the magnetic energy per unit volume. Similarly, according to Eq. (3.82), the third term
under the integral is the minus time derivative of the electric energy per unit volume. Finally, we can use
the divergence theorem to transform the integral of the first term to a 2D integral over the surface $S$


53 See, e.g., MA Eq. (11.7) with f = E and g = H.

Poynting theorem

Electromagnetic energy density

Poynting vector


limiting the volume $V$. As a result, we get the so-called Poynting theorem 54 for the power balance in the
system:

$$\int_V \left(p + \frac{\partial u}{\partial t}\right) d^3r + \oint_S S_n\, d^2r = 0. \tag{6.103}$$


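Here $u$ is the density of the total (electric plus magnetic) energy of the electromagnetic field, with

$$\delta u = \mathbf{E}\cdot\delta\mathbf{D} + \mathbf{H}\cdot\delta\mathbf{B}, \tag{6.104a}$$

so that for an isotropic, linear, and dispersion-free medium, with $\mathbf{D}(t) = \varepsilon\mathbf{E}(t)$, $\mathbf{B}(t) = \mu\mathbf{H}(t)$,

$$u = \frac{\mathbf{E}\cdot\mathbf{D}}{2} + \frac{\mathbf{H}\cdot\mathbf{B}}{2} = \frac{\varepsilon E^2}{2} + \frac{B^2}{2\mu}, \tag{6.104b}$$

and $\mathbf{S}$ is the Poynting vector defined as 55

$$\mathbf{S} \equiv \mathbf{E}\times\mathbf{H}. \tag{6.105}$$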


The first integral in Eq. (103) is evidently the net change of the energy of the system (particles +
field) in unit time, so that the second (surface) integral is certainly the power flowing out from the
system through the surface, and it is tempting to interpret the Poynting vector $\mathbf{S}$ locally, as the power
flow density at the given point. 56 In many cases such a local interpretation of the vector $\mathbf{S}$ is legitimate;
however, in some cases it may lead to wrong conclusions. Indeed, let us consider the simple system shown
in Fig. 10: a planar capacitor placed into a static and uniform external magnetic field so that the electric
and magnetic fields are mutually perpendicular. In this static situation, no charges are moving, both $p$
and $\partial/\partial t$ are equal to zero, and there should be no power flow in the system. However, Eq. (105) shows that
the Poynting vector is not equal to zero inside the capacitor, being directed as shown in Fig. 10.



Fig. 6.10. The Poynting vector paradox. 


From the point of view of our only unambiguous corollary of the Maxwell equations, Eq. (103), 
there is no contradiction here, because the fluxes of vector S through the walls of any volume V, for 
example the side walls of the volume shown with dashed lines in Fig. 10, are equal and opposite (and 
they are zero for other faces of this rectilinear volume), so that the total flux of the Poynting vector 


54 Called after J. Poynting, though this fact was independently discovered by O. Heaviside, while a similar 
expression for the intensity of mechanical elastic waves had been derived earlier by N. Umov. 

55 Actually, an addition to S of the curl of an arbitrary vector function f(r, t) does not change Eq. (103). Indeed,
we may use the divergence theorem to transform the corresponding change of the surface integral in Eq. (103) to a
volume integral of scalar function $\nabla\cdot(\nabla\times\mathbf{f})$ that equals zero at any point - see, e.g., MA Eq. (11.2).

56 Later in the course we will show that the Poynting vector is also directly related to the density of momentum of 
the electromagnetic field. 
equals zero, as it should. It is, however, useful to recall this example each time before giving the local
interpretation to vector S.
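The bookkeeping behind this example may be illustrated by a minimal numerical sketch. It assumes uniform fields inside the capacitor and a rectangular test volume aligned with them; the field magnitudes and the box dimensions below are hypothetical values chosen only for illustration, and the helper names `cross` and `box_flux` are not from the text.

```python
# Net flux of a uniform Poynting vector S = E x H through a closed rectangular box.
# All numbers are assumed, illustrative values; they are not taken from the text.

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def box_flux(S, lx, ly, lz):
    """Outward flux of a uniform vector S through a box with edges lx, ly, lz."""
    faces = [
        ((+1, 0, 0), ly * lz), ((-1, 0, 0), ly * lz),
        ((0, +1, 0), lx * lz), ((0, -1, 0), lx * lz),
        ((0, 0, +1), lx * ly), ((0, 0, -1), lx * ly),
    ]
    return sum(area * sum(s * n for s, n in zip(S, normal))
               for normal, area in faces)

E = (0.0, 0.0, 1.0e3)   # V/m, assumed capacitor field along z
H = (0.0, 50.0, 0.0)    # A/m, assumed external magnetic field along y
S = cross(E, H)         # W/m^2: nonzero, directed along -x
net_flux = box_flux(S, 0.1, 0.2, 0.01)   # box edges in meters
```

The local vector S does not vanish, but the contributions of the two opposite side walls cancel, so that the surface integral in Eq. (103) gives no net power flow.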


Finally, to complete the initial discussion of the Maxwell equations, 57 let us rewrite them in
terms of potentials $\mathbf{A}$ and $\phi$, because this is more convenient for the solution of some (though not all!)
problems. Even when dealing with a more general system (93) of Maxwell equations than before, Eqs.
(7) and (5.27),

$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}, \tag{6.106}$$


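are still used as potential definitions. It is straightforward to verify that with these definitions, two
homogeneous Maxwell equations (93b) are satisfied automatically. Plugging Eqs. (106) into the
inhomogeneous equations (93a), and considering, for simplicity, a linear, uniform medium with
frequency-independent $\varepsilon$ and $\mu$, we get

$$\nabla^2\phi + \frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = -\frac{\rho}{\varepsilon}, \qquad
\nabla^2\mathbf{A} - \varepsilon\mu\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left(\nabla\cdot\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t}\right) = -\mu\mathbf{j}. \tag{6.107}$$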


This is a more complex result than what we would like to get. However, let us select a special 
gauge that is frequently called (especially for the free space case, when v = c) the Lorenz gauge 
condition 58 


$$\nabla\cdot\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t} = 0, \tag{6.108}$$


which is a natural generalization of the Coulomb gauge (5.48) for time-dependent phenomena. With this 
condition, Eqs. (107) are reduced to a simpler, beautifully symmetric form: 59 


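$$\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\varepsilon}, \qquad
\nabla^2\mathbf{A} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu\mathbf{j}, \tag{6.109}$$

where $v^2 \equiv 1/\varepsilon\mu$. 60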


57 We will return to their general discussion (in particular, to the analytical mechanics of the electromagnetic 
field, and its stress tensor) in Sec. 9.8, after we have got equipped with the special relativity theory. 

58 This condition, named after L. Lorenz, should not be confused with the Lorentz invariance condition of the 
relativity theory, due to H. Lorentz (note the names’ spelling) - see Sec. 9.4. 

59 Note that Eqs. (109) are essentially a set of 4 similar equations for 4 scalar functions (namely, $\phi$ and three
Cartesian components of vector A) and thus clearly invite the 4-component vector formalism of the relativity
theory - which will be discussed in Chapter 9.

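60 Here I have to mention in passing the so-called Hertz vector potentials $\boldsymbol{\Pi}_e$ and $\boldsymbol{\Pi}_m$ (whose introduction may be
traced to at least the 1904 work by E. Whittaker). They may be defined by the following relations:

$$\mathbf{A} = \mu\frac{\partial\boldsymbol{\Pi}_e}{\partial t} + \mu\nabla\times\boldsymbol{\Pi}_m, \qquad \phi = -\frac{1}{\varepsilon}\nabla\cdot\boldsymbol{\Pi}_e,$$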




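If $\phi$ and $\mathbf{A}$ depend on just one spatial coordinate, say z, in a region without field sources: $\rho = 0$, $\mathbf{j} = 0$,
Eqs. (109) are reduced to the well-known 1D wave equations

$$\frac{\partial^2\phi}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = 0, \qquad
\frac{\partial^2\mathbf{A}}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0, \tag{6.110}$$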


describing waves propagating with velocity v. Note that due to the definitions of constants $\varepsilon_0$ and $\mu_0$, in
free space v is just the speed of light:

$$v = \frac{1}{(\varepsilon_0\mu_0)^{1/2}} \equiv c. \tag{6.110}$$
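This relation is easy to check numerically. The sketch below is only an illustration: it assumes the pre-2019 SI definition $\mu_0 = 4\pi\times10^{-7}$ H/m and the CODATA 2018 value of $\varepsilon_0$, neither of which is quoted in the text.

```python
import math

MU_0 = 4e-7 * math.pi        # H/m, assumed pre-2019 SI value
EPS_0 = 8.8541878128e-12     # F/m, assumed CODATA 2018 value

# Free-space wave velocity following from the constants alone
v_free = 1.0 / math.sqrt(EPS_0 * MU_0)   # m/s
```

With these inputs, `v_free` agrees with the defined speed of light, 299,792,458 m/s, to better than 1 m/s.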


Historically, the experimental observation of relatively low-frequency (GHz-scale) electromagnetic waves,
together with the proof that their speed in free space is equal to that of light, was the decisive confirmation of Maxwell’s
theory. 61 A detailed study of this most important physical phenomenon is the main goal of the next
chapters of this course.


6.8 Exercise problems 

6.1. Prove that the electromagnetic induction e.m.f. $\mathscr{V}_{\rm ind}$ in a
conducting loop may be measured:

(i) by measuring the current $I = \mathscr{V}_{\rm ind}/R$ induced in the closed loop with
Ohmic resistance R, or

(ii) using a voltmeter inserted into the loop - see Fig. on the right.



6.2. The magnetic flux $\Phi$ that pierces a plane, round, uniform,
resistive ring is being changed in time, while the magnetic field outside
of the ring is negligibly low. A voltmeter is connected to a part of the ring
as shown in Fig. on the right. What would the voltmeter show?



6.3 . A weak, uniform magnetic field B is applied to an axially-symmetric permanent magnet, 
with a dipole magnetic moment m directed along its symmetry axis, rapidly rotating about the same 
axis, with an angular momentum L. Calculate the electric field resulting from field’s application, and 
formulate the conditions of your result’s validity. 


which make the Lorenz gauge condition (108) automatically satisfied. These potentials are especially convenient
for the solution of problems in which the electromagnetic field is excited by external sources characterized by
externally fixed electric and magnetic polarizations $\mathbf{P}_{\rm ext}$ and $\mathbf{M}_{\rm ext}$ - rather than fixed charge and current densities $\rho$
and $\mathbf{j}$. Indeed, it is straightforward to check that both $\boldsymbol{\Pi}_e$ and $\boldsymbol{\Pi}_m$ satisfy equations similar to Eqs. (109), but with
the right-hand parts equal to, respectively, $-\mathbf{P}_{\rm ext}$ and $-\mathbf{M}_{\rm ext}$.

61 This was first accomplished in 1886 by H. Hertz, using specially designed electronic circuits and antennas. 
6.4. Use the electromagnetic induction law (5) to derive Eq. (5.128) for the magnetic field
energy variation.


6.5. A uniform, static magnetic field B is applied along the axis of a
long round pipe of a radius R, and a very small thickness $\tau$, made of a
material with Ohmic conductivity $\sigma$. A sphere of mass M and radius $R' < R$,
made of a linear magnetic with permeability $\mu \gg \mu_0$, is launched, with an
initial velocity $v_0$, to fly ballistically along the pipe’s axis - see Fig. on the
right. Use the quasistatic approximation to calculate the distance the sphere
would pass before it stops. Formulate conditions of validity of your result.



6.6. AC current of frequency $\omega$ is being passed through a long uniform wire with a round cross-section
of radius R that is comparable with the skin depth $\delta_s$. In the quasistatic approximation, find the
current density distribution across the wire. Analyze the limits $R \ll \delta_s$ and $R \gg \delta_s$.

6.7. A very long, round cylinder of radius R, made of a uniform Ohmic
conductor with conductivity $\sigma$ and magnetic permeability $\mu$, has been placed into
a uniform ac magnetic field $H_{\rm ext} = H_0\cos\omega t$, directed along its symmetry axis -
see Fig. on the right. Calculate the spatial distribution of the magnetic field’s
amplitude, and in particular its value on the cylinder’s axis. Spell out the last result
in the limits of relatively small and large R.

Hint: As shortcuts, you are welcome to reuse parts of the solution of the
previous problem.



6.8.* Define and calculate an appropriate spatial-temporal Green’s function for Eq. (20), and use
this function to analyze the dynamics of propagation of the external magnetic field, suddenly turned on
at t = 0 and then left constant:

$$H(x = 0, t) = \begin{cases} 0, & \text{at } t < 0, \\ H_0, & \text{at } t > 0, \end{cases}$$

into an Ohmic conductor occupying half-space x > 0 - see Fig. 6.2.

Hint: Try to use a function proportional to $\exp\{-(x - x')^2/2(\delta x)^2\}$, with a suitable time
dependence of parameter $\delta x$, and a properly selected pre-exponential factor.


6.9 . Solve the previous problem using the variable separation method, and compare the results. 

6.10. A small, planar wire loop, carrying current I, is located far from a plane surface of a
superconductor. Within the “macroscopic” description of superconductivity (B = 0), find:

(i) the energy of the loop-superconductor interaction, 

(ii) the force and torque acting on the loop, 

(iii) the distribution of supercurrents on the superconductor surface. 
6.11. A straight, uniform magnet of length 2l, cross-section area $A \ll l^2$,
and mass m, with a permanent longitudinal magnetization $M_0$, is
placed over a horizontal surface of a superconductor - see Fig. on the
right. Within the macroscopic model of the Meissner-Ochsenfeld effect,
find the stable equilibrium position of the magnet.

6.12. A plane superconducting wire loop, of area A and
inductance L, may rotate, without friction, about a horizontal axis 0 (in
Fig. on the right, perpendicular to the plane of drawing) passing
through its center of mass. Initially the loop was horizontal (with $\theta = 0$),
and carried supercurrent $I_0$ in such direction that its magnetic dipole
vector was directed down. Then a uniform magnetic field B, directed
vertically up, was applied. Find all possible equilibrium positions (angles $\theta$) of the loop, analyze their
stability, and give a physical interpretation of the results.

6.13. Use the London equation to analyze the penetration of external magnetic field into a thin
($t \sim \delta_L$), planar superconductor film whose plane is parallel to the field.



6.14. Use the London equation to calculate the distribution of supercurrent density j across the
circular cross-section (with radius $R \sim \delta_L$) of a long, straight superconducting wire carrying dc current I.


6.15. Use the London equation to calculate the
inductance (per unit length) of a long, uniform superconducting
strip placed close to the surface of a similar superconductor -
see Fig. on the right, which shows the structure’s cross-section.

Hint: Start from thinking how the supercurrent is
distributed along the surfaces of the strip and the bulk superconductor.

[Fig.: cross-section of the strip over the bulk superconductor, with labels $w \gg d$, $t \sim \delta_L$, $d \gg \delta_L$, and current I.]


6.16. Analyze the magnetic field shielding by a superconducting film of small thickness $t \ll \delta_L$,
by calculating the penetration of the field induced by current I flowing in a thin wire which runs parallel
to a wide, plane thin film, at distance $d \gg t$ from it, into the half-space behind the film.


6.17. Calculate the self-inductance of a superconducting cable with a
round cross-section (see Fig. on the right) in the following limits:

(i) $\delta_L \ll a,\ b,\ c - b$, and

(ii) $a \ll \delta_L \ll b,\ c - b$.



6.18 . Use Eqs. (59) and (71) to calculate the energy of a Josephson junction, and the full energy 
of the SQUID shown in Fig. 4c. 




6.19. Analyze the possibility of wave propagation in a long,
uniform chain of lumped inductances and capacitances - see Fig. on
the right.

[Fig.: chain of lumped inductances L and capacitances C.]

Hint: Readers without prior experience with electromagnetic
wave analysis may like to use a substantial analogy between this effect and mechanical waves in a 1D
chain of elastically coupled particles. 62


6.20. A sinusoidal e.m.f. of amplitude $V_0$ and frequency $\omega$
is applied to an end of a long chain of similar, lumped resistors
and capacitors (see Fig. on the right). Calculate the law of decay
of the rf oscillation amplitude in the chain.

[Fig.: chain of lumped resistors R and capacitors C.]

6.21. Calculate the pressure exerted by the magnetic field B inside a magnetic-free solenoid of
length l, cross-section area $A \ll l^2$ and N turns, on its “walls” (windings), and the force exerted by the
field on the solenoid’s ends. Give a physical interpretation of the direction of these forces.


6.22. In Sec. 6 we have seen that the displacement current concept allows one to generalize the
Ampère law to time-dependent processes as

$$\oint_C \mathbf{H}\cdot d\mathbf{r} = I_S + \frac{\partial}{\partial t}\int_S D_n\, d^2r.$$

We also have seen that such generalization makes $\oint\mathbf{H}\cdot d\mathbf{r}$ over the
contour C, which was shown in Fig. 9 (see also Fig. on the right),
independent of the choice of surface S limited by the contour. However, it
may look like the situation is different for a contour C’ drawn inside the
capacitor. Indeed, if the contour’s radius $\rho$ is much larger than the capacitor’s
thickness d, the magnetic field H, created by the linear current I of the
contour line is virtually the same as that of a continuous wire, and hence
integral $\oint\mathbf{H}\cdot d\mathbf{r}$ along contour C’ is apparently the same as that along contour
C, while the electric flux $\int D_n\, d^2r$ through the surface S’ limited by contour
C’ is evidently smaller, while $I_{S'} = 0$, so that the above equation seems
invalid. Resolve the paradox, for simplicity considering an axially-symmetric
system.



62 See, e.g., CM Sec. 5.3. 
Chapter 7. Electromagnetic Wave Propagation

This (long!) chapter focuses on the most important effect that follows from the time-dependent Maxwell
equations, namely the electromagnetic waves, at this stage avoiding a discussion of their origin, i.e.
radiation. I start from the simplest, plane waves in a uniform and isotropic medium. The next step is a
discussion of non-uniform systems, in particular those with sharp boundaries between different materials,
which bring in such new effects as wave reflection and refraction. Then I will proceed to the structure of
electromagnetic waves propagating along various long, cylindrical structures, called transmission lines
- such as coaxial cables, waveguides, and optical fibers. In the end of the chapter, electromagnetic
oscillations in finite-length fragments of such lines, serving as resonators, are also discussed.


7.1. Plane waves

Let us start from considering a spatial region that does not contain field sources ($\rho = 0$, $\mathbf{j} = 0$),
and is filled with a linear, uniform, isotropic medium, which obeys Eqs. (3.38) and (5.110):

$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad \mathbf{B} = \mu\mathbf{H}. \tag{7.1}$$




Moreover, let us assume for a minute that these material equations hold for all frequencies of interest.
As was already shown in Sec. 6.7, in this case the Lorenz gauge condition (6.108) allows the Maxwell
equations to be recast into wave equations (6.110) for the vector and scalar potentials. However, for
most of our purposes it is more convenient to use directly the homogeneous Maxwell equations (6.94) for
the electric and magnetic fields - which are independent of the gauge choice. After the elementary
elimination of D and B using Eq. (1), 1 these equations take a simple, symmetric form

$$\nabla\times\mathbf{E} + \mu\frac{\partial\mathbf{H}}{\partial t} = 0, \qquad \nabla\times\mathbf{H} - \varepsilon\frac{\partial\mathbf{E}}{\partial t} = 0, \tag{7.2a}$$

$$\nabla\cdot\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{H} = 0. \tag{7.2b}$$


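Now, taking the curl ($\nabla\times$) of each of Eqs. (2a), and using the vector algebra identity (5.31), whose first
term, for both E and H, vanishes due to Eqs. (2b), we get similar wave equations for the electric and
magnetic fields:

$$\left(\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)\mathbf{E} = 0, \qquad
\left(\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)\mathbf{H} = 0, \tag{7.3}$$

where parameter v is defined by relation

$$v^2 \equiv \frac{1}{\varepsilon\mu}, \tag{7.4}$$

with $v^2 = 1/\varepsilon_0\mu_0 \equiv c^2$ in free space.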
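For the reader's convenience, the curl step may be spelled out for the electric field; the derivation for H is the same, with the roles of the two Eqs. (2a) swapped. This is only a sketch, assuming that the identity (5.31) is the standard double-curl identity used in its second line.

```latex
\begin{align*}
\nabla\times(\nabla\times\mathbf{E})
  &= -\mu\,\frac{\partial}{\partial t}\,(\nabla\times\mathbf{H})
   = -\varepsilon\mu\,\frac{\partial^2\mathbf{E}}{\partial t^2}
   && \text{(both Eqs. (2a))}, \\
\nabla\times(\nabla\times\mathbf{E})
  &= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
   = -\nabla^2\mathbf{E}
   && \text{(first of Eqs. (2b))}, \\
\Longrightarrow\quad
\nabla^2\mathbf{E} - \varepsilon\mu\,\frac{\partial^2\mathbf{E}}{\partial t^2} &= 0 .
\end{align*}
```

Comparing the two expressions for the double curl gives the wave equation for E with $1/v^2 = \varepsilon\mu$.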


1 Though B rather than H is the actual (microscopically-averaged) magnetic field, it is mathematically more
convenient (just as in Sec. 6.2) to use the latter vector in our current discussion, because at sharp media
boundaries, H obeys the boundary condition (5.118) similar to that for E - see Eq. (3.47).


© 2013-2016 K. Likharev 




Two vector equations (3) are of course six similar equations for three Cartesian components of
two vectors E and H. Each of these equations allows, in particular, the following solution,

$$f = f(z - vt), \tag{7.5}$$

where z is the Cartesian coordinate along a certain (arbitrary) direction n. This solution describes a
specific type of a wave, i.e. a certain field pattern moving, without deformation, along axis z, with
velocity v. According to Eq. (5), each variable f has the same value in each plane perpendicular to the
direction n of wave propagation, hence the name - plane wave.
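The statement that any function of the argument (z - vt) satisfies the 1D wave equation may be spot-checked numerically. The following sketch uses central finite differences; the Gaussian waveform, the dimensionless units, and the names `f`, `wave_residual`, and `V` are assumptions made only for this illustration.

```python
import math

def f(xi):
    """Arbitrary smooth waveform; a Gaussian pulse is assumed only for illustration."""
    return math.exp(-xi * xi)

def wave_residual(z, t, v, h=1e-3):
    """Central-difference estimate of d2f/dz2 - (1/v**2) d2f/dt2 for the field f(z - v*t)."""
    field = lambda z_, t_: f(z_ - v * t_)
    d2_dz2 = (field(z + h, t) - 2 * field(z, t) + field(z - h, t)) / h**2
    d2_dt2 = (field(z, t + h) - 2 * field(z, t) + field(z, t - h)) / h**2
    return d2_dz2 - d2_dt2 / v**2

V = 2.0  # wave velocity in arbitrary (dimensionless) units
residuals = [wave_residual(z, t, V) for z in (-1.0, 0.3, 2.0) for t in (0.0, 0.5)]
```

All entries of `residuals` should be close to zero, limited only by the finite-difference step, while the individual second derivatives are of the order of unity.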


According to Eqs. (2), the independence of the wave equations (3) for vectors E and H does not
mean that their plane-wave solutions are independent. Indeed, plugging solution (5) into Eqs. (2a), we
get

$$\mathbf{H} = \frac{\mathbf{n}\times\mathbf{E}}{Z}, \quad \text{i.e. } \mathbf{E} = Z\,\mathbf{H}\times\mathbf{n}, \tag{7.6}$$

where constant Z is defined as

$$Z \equiv \left(\frac{\mu}{\varepsilon}\right)^{1/2}. \tag{7.7}$$


The vector relation (6) means, first of all, that vectors E and H are perpendicular not only to 
vector n (such waves are called transverse), but also to each other (Fig. 1) - at any point of space and at 
any time instant. 



Fig. 7.1. Field vectors in a plane electromagnetic 
wave propagating along direction n. 


Second, the field magnitudes are related by constant Z, called the wave impedance of the
medium. Very soon we will see that the wave impedance plays a pivotal role in many problems, in
particular at the wave reflection from the interface between two media. Since the dimensionality of E, in
SI units, is V/m, and that of H is A/m, Eq. (7) shows that Z has the dimensionality of V/A, i.e. ohms
($\Omega$). 2 In particular, in free space,

$$Z = Z_0 \equiv \left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2} = 4\pi\times10^{-7}\,c \approx 377\ \Omega. \tag{7.8}$$
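The free-space value is readily reproduced from Eq. (7). In the sketch below, the numerical values of the constants are assumed (pre-2019 SI $\mu_0$, CODATA 2018 $\varepsilon_0$), and the relative parameters `eps_r` and `mu_r` of the helper `wave_impedance` are introduced only for this illustration.

```python
import math

MU_0 = 4e-7 * math.pi        # H/m, assumed pre-2019 SI value
EPS_0 = 8.8541878128e-12     # F/m, assumed CODATA 2018 value

def wave_impedance(eps_r=1.0, mu_r=1.0):
    """Wave impedance Z = (mu/eps)**0.5 of a uniform, lossless medium, in ohms."""
    return math.sqrt(mu_r * MU_0 / (eps_r * EPS_0))

Z_0 = wave_impedance()                   # free space, about 376.73 ohms
Z_glass_like = wave_impedance(eps_r=4.0) # hypothetical non-magnetic dielectric
```

For a non-magnetic dielectric, the impedance drops as the inverse square root of the relative permittivity, so that the hypothetical medium with `eps_r = 4` has exactly half of the free-space value.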


Now plugging Eq. (6) into Eqs. (6.104b) and (6.105), we get:

$$u = \varepsilon E^2 = \mu H^2, \tag{7.9a}$$

$$\mathbf{S} \equiv \mathbf{E}\times\mathbf{H} = \mathbf{n}\frac{E^2}{Z} = \mathbf{n}ZH^2, \tag{7.9b}$$

so that, according to Eqs. (4) and (7), the wave’s energy and power densities are universally related as

$$\mathbf{S} = \mathbf{n}uv. \tag{7.9c}$$


2 In Gaussian units, E and H have the same dimensionality (in particular, in a free-space wave, E = H), making the
(very useful) notion of the wave impedance less manifestly exposed - and in some textbooks not mentioned at all.

In the view of the Poynting vector paradox discussed in Sec. 6.7 (see Fig. 6.10), one may wonder
whether this expression may be interpreted as the actual density of power flow. In contrast to the static
situation shown in Fig. 6.10, which limits the electric and magnetic fields to a vicinity of their sources,
waves may travel far from them. As a result, they can form wave packets of finite length in free space -
see Fig. 2.







Fig. 7.2. Interpreting the Poynting vector 
in an electromagnetic wave. 


Let us apply the Poynting theorem (6.103) to the cylinder shown by dashed lines in Fig. 2, with
one lid inside the wave packet, and another lid in the region already passed by the wave. Then,
according to Eq. (6.103), the rate of change of the full energy $\mathscr{E}$ inside the volume is $d\mathscr{E}/dt = -SA$ (where
A is the lid area), so that S may be indeed interpreted as the power flow (per unit area) from the volume.
Making a reasonable assumption that the finite length of a sufficiently long wave packet does not affect
the physics inside it, we may indeed interpret the S given by Eq. (9) as the power flow density inside a
plane electromagnetic wave.

As we will see later in this chapter, the free-space value $Z_0$ of the wave impedance, given by Eq.
(8), establishes the scale of wave impedances of virtually all wave transmission lines, so we may use it
and Eq. (9) to get some sense of how different are the electric and magnetic field amplitudes in the
waves, on the scale of typical electrostatics and magnetostatics experiments. For example, according to
Eqs. (9), a wave of a modest intensity S = 1 W/m$^2$ (the power density we get from a usual electric bulb a
few meters away from it) has $E \sim (SZ_0)^{1/2} \sim 20$ V/m, quite comparable with the dc field created by an
AA battery right outside it. On the other hand, the wave’s magnetic field $H = (S/Z_0)^{1/2} \approx 0.05$ A/m. For
this particular case, the relation following from Eqs. (1), (4), and (7),

$$B = \mu H = \mu\frac{E}{Z} = \mu\frac{E}{(\mu/\varepsilon)^{1/2}} = (\varepsilon\mu)^{1/2}E = \frac{E}{v}, \tag{7.10}$$

gives $B = \mu_0 H = E/c \sim 7\times10^{-8}$ T, i.e. a magnetic field a thousand times less than the Earth field, and about 8
orders of magnitude lower than the field of a typical permanent magnet. A possible interpretation of this
huge difference is that the scale of magnetic fields $B \sim E/c$ in the waves is “normal” for
electromagnetism, while that of permanent magnet fields is abnormally high, because they are due to the
ferromagnetic alignment of electron spins, essentially quantum objects - see the discussion in Sec. 5.5.
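These estimates may be reproduced with a few lines of arithmetic. The sketch below assumes free space and the same assumed values of the constants as above; the function name `plane_wave_amplitudes` is introduced only for this illustration.

```python
import math

MU_0 = 4e-7 * math.pi                     # H/m, assumed pre-2019 SI value
EPS_0 = 8.8541878128e-12                  # F/m, assumed CODATA 2018 value
Z_0 = math.sqrt(MU_0 / EPS_0)             # free-space wave impedance, ohms
C = 1.0 / math.sqrt(EPS_0 * MU_0)         # speed of light, m/s

def plane_wave_amplitudes(S):
    """Return (E, H, B) of a free-space plane wave with power flow density S, in SI units."""
    E = math.sqrt(S * Z_0)    # V/m, from S = E**2 / Z_0
    H = math.sqrt(S / Z_0)    # A/m, from S = Z_0 * H**2
    B = MU_0 * H              # T, equal to E / c
    return E, H, B

E, H, B = plane_wave_amplitudes(1.0)      # S = 1 W/m^2
```

For S = 1 W/m$^2$ this gives E slightly below 20 V/m, H slightly above 0.05 A/m, and B between $6\times10^{-8}$ and $7\times10^{-8}$ T, in agreement with the rounded numbers quoted in the text.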


As soon as $\varepsilon$ and $\mu$ are simple constants, wave speed v is also constant, and Eq. (5) is valid for
an arbitrary function f - defined by either initial or boundary conditions. In plain English, a medium
with frequency-independent $\varepsilon$ and $\mu$ supports propagation of plane waves with an arbitrary waveform
without either decay (attenuation) or deformation (dispersion). However, for any real medium but pure
vacuum, this approximation is valid only within limited frequency intervals. We will discuss the effects
of attenuation and dispersion in the next section and see that all our prior results remain valid even in
that general case, provided that we limit them to single-frequency (i.e. sinusoidal, or monochromatic)
waves. Such waves may be most conveniently presented as 3

$$f = \mathrm{Re}\left[f_\omega e^{i(kz - \omega t)}\right], \tag{7.11}$$


where $f_\omega$ is the complex amplitude of the wave, and k is its wave number (the magnitude of wave vector
$\mathbf{k} \equiv \mathbf{n}k$), sometimes also called the spatial frequency. The last term is justified by the fact, evident from
Eq. (11), that k is related to the wavelength $\lambda$ exactly as the usual (“temporal”) frequency $\omega$ is related to
the time period T:

$$k = \frac{2\pi}{\lambda}, \qquad \omega = \frac{2\pi}{T}. \tag{7.12}$$

Requiring Eq. (11) to be a particular form of Eq. (5), i.e. the argument $(kz - \omega t) \equiv k[z - (\omega/k)t]$ to be
proportional to $(z - vt)$, so that $\omega/k = v$, we see that the wave number should equal

$$k = \frac{\omega}{v} = \omega(\varepsilon\mu)^{1/2}, \tag{7.13}$$

showing that in this “dispersion-free” case the dispersion relation $\omega(k)$ is linear.


Now note that Eq. (6) does not mean that vectors E and H retain their direction in space. (The
simple case when they do is called the linear polarization of the wave.) Indeed, nothing in the Maxwell
equations prevents, for example, joint rotation of this pair of vectors around the fixed vector n, while
still keeping all these three vectors perpendicular to each other at all times. An arbitrary rotation law, or
even an arbitrary constant frequency of such rotation, however, would violate the single-frequency
(monochromatic) character of the elementary sinusoidal wave (11). In order to understand what is the
most general type of polarization the wave may have without violating that condition, let us present two
Cartesian components of one of these vectors (say, E) along any two fixed axes x and y, perpendicular to
each other and axis z (i.e. vector n), in the same form as used in Eq. (11):

$$E_x = \mathrm{Re}\left[E_{\omega x} e^{i(kz - \omega t)}\right], \qquad E_y = \mathrm{Re}\left[E_{\omega y} e^{i(kz - \omega t)}\right]. \tag{7.14}$$

In order to keep the wave monochromatic, complex amplitudes $E_{\omega x}$ and $E_{\omega y}$ must be constant; however,
they may have different magnitudes and an arbitrary phase shift between them.


3 Due to the linearity of Eqs. (2), operator Re in Eq. (11) may be ignored until the end of almost any calculation.
Because of that, the exponential presentation of monochromatic variables is more convenient than manipulation
with sine and cosine functions. (See also CM Sec. 4.1.)


In the simplest case when the arguments of the complex amplitudes are equal,

$$E_{\omega x,y} = |E_{\omega x,y}|\,e^{i\varphi}, \tag{7.15}$$

the real field components have the same phase:

$$E_{x,y} = |E_{\omega x,y}|\cos(kz - \omega t + \varphi), \tag{7.16}$$

so that their ratio is constant in time - see Fig. 3a. This means that the wave is linearly polarized, within
the plane defined by relation

$$\tan\theta = \frac{|E_{\omega y}|}{|E_{\omega x}|}. \tag{7.17}$$



Fig. 7.3. Time evolution of the electric field vector in (a) linearly-polarized, (b) circularly-polarized, and
(c) elliptically-polarized waves.


Another simple case is when the moduli of the complex amplitudes $E_{\omega x}$ and $E_{\omega y}$ are equal, but
their phases are shifted by $+\pi/2$ or $-\pi/2$:

$$E_{\omega x} = |E_\omega|\,e^{i\varphi}, \qquad E_{\omega y} = |E_\omega|\,e^{i(\varphi \pm \pi/2)}. \tag{7.18}$$

In this case

$$E_x = |E_\omega|\cos(kz - \omega t + \varphi), \qquad E_y = |E_\omega|\cos\left(kz - \omega t + \varphi \pm \frac{\pi}{2}\right) = \mp|E_\omega|\sin(kz - \omega t + \varphi). \tag{7.19}$$

This means that on the [x, y] plane, the end of vector E moves, with the wave’s frequency $\omega$, either
clockwise or counterclockwise around a circle - see Fig. 3b:

$$\theta(t) = \pm(\omega t - \varphi). \tag{7.20}$$


Such waves are called circularly polarized. 4 These particular solutions of the Maxwell equations
are very convenient for quantum electrodynamics, because single electromagnetic field quanta with a
certain (positive or negative) spin direction may be considered as elementary excitations of the
corresponding circularly-polarized wave. (This fact does not exclude, from the quantization scheme,
waves of other polarizations, because any monochromatic wave may be presented as a linear
combination of two circularly-polarized waves with opposite helicities, just as Eqs. (14) present it as a
linear combination of two linearly-polarized waves.)


4 In the convention that dominates research and engineering literature (but unfortunately is not universal), the
wave is called right-polarized (RP) if it is described by the lower sign in Eqs. (18)-(20), and left-polarized (LP) in
the opposite case. Another popular term for these cases is the “waves of negative / positive helicity”.

Finally, in the general case of arbitrary complex amplitudes $E_{\omega x}$ and $E_{\omega y}$, the electric field vector
end moves along an ellipse on the [x, y] plane (Fig. 3c); such a wave is called elliptically polarized. The
eccentricity and orientation of the ellipse are completely described by one complex number, the ratio
$E_{\omega x}/E_{\omega y}$, i.e. two real numbers: $|E_{\omega x}/E_{\omega y}|$ and $\varphi = \arg(E_{\omega x}/E_{\omega y})$.
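The three cases discussed above may be summarized in a short classification routine. This is a minimal sketch rather than a definitive implementation: the function name `classify_polarization` and the tolerance `tol` are assumptions made for this illustration, and the routine deliberately does not distinguish the two rotation directions.

```python
import cmath
import math

def classify_polarization(E_x, E_y, tol=1e-9):
    """Classify a monochromatic plane wave by its complex amplitudes E_x and E_y."""
    E_x, E_y = complex(E_x), complex(E_y)
    a_x, a_y = abs(E_x), abs(E_y)
    if a_x < tol or a_y < tol:
        return "linear"                   # only one Cartesian component is present
    delta = cmath.phase(E_y / E_x)        # phase shift between the components
    if abs(math.sin(delta)) < tol:
        return "linear"                   # equal (or opposite) phases, as in Eq. (15)
    equal_moduli = abs(a_x - a_y) < tol * max(a_x, a_y)
    quarter_shift = abs(abs(delta) - math.pi / 2) < tol
    if equal_moduli and quarter_shift:
        return "circular"                 # Eq. (18)
    return "elliptical"                   # general case
```

For example, `classify_polarization(1, 2)` returns `"linear"`, `classify_polarization(1, 1j)` returns `"circular"`, and `classify_polarization(1, 0.5j)` returns `"elliptical"`.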

The same information may be expressed via four so-called Stokes parameters $s_0$, $s_1$, $s_2$, $s_3$, which
are popular in optics because they may be used for the description of not only completely coherent
waves that are discussed here, but also of partly coherent or even fully incoherent waves - including the
natural light emitted by thermal sources like our Sun. In contrast to the notion of coherent waves whose
complex amplitudes are considered deterministic numbers, the instant amplitudes of incoherent waves
should be treated as stochastic variables. 5
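Since the text does not spell out the Stokes parameters, the sketch below assumes one common convention for a fully coherent wave; the overall normalization and the sign of $s_3$ differ between textbooks, so these formulas should be read as an assumption rather than as the definition used in this course.

```python
def stokes_parameters(E_x, E_y):
    """Stokes parameters (s0, s1, s2, s3) of a coherent wave, in one assumed convention."""
    E_x, E_y = complex(E_x), complex(E_y)
    s0 = abs(E_x) ** 2 + abs(E_y) ** 2     # total intensity
    s1 = abs(E_x) ** 2 - abs(E_y) ** 2     # preference for x over y linear polarization
    mixed = E_x.conjugate() * E_y
    s2 = 2 * mixed.real                    # preference for the +45 degree linear polarization
    s3 = 2 * mixed.imag                    # preference for one circular polarization
    return s0, s1, s2, s3

s_linear_45 = stokes_parameters(1, 1)      # linear polarization at 45 degrees
s_circular = stokes_parameters(1, 1j)      # circular polarization
```

In this convention, a fully coherent wave always satisfies $s_0^2 = s_1^2 + s_2^2 + s_3^2$, so that only three of the four parameters are independent; for partly coherent waves, with averaged products of the amplitudes, the equality is replaced by an inequality.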


7.2. Attenuation and dispersion 

Now let me show that any linear, isotropic medium may be characterized by complex,
frequency-dependent electric permittivity $\varepsilon(\omega)$ and magnetic permeability $\mu(\omega)$. Indeed, starting from
electric effects, the electric polarization of realistic elementary dipoles of the medium cannot follow the
applied electric field instantly, if the field frequency $\omega$ is comparable with those of the internal processes
- say, transitions between atomic energy levels. Let us consider the most general law of time evolution
of polarization P(t) for the case of arbitrary applied electric field E(t), 6 but for a sufficiently dilute
medium, so that the local electric field $E_{\rm ef}$ (3.63), acting on each elementary dipole, is essentially the
microscopically-averaged field E. 7 Then, due to the linear superposition principle, P(t) should be a
linear sum (integral) of the values of E(t’) at all previous moments of time, t’ < t, weighed by some
function of t and t’:

$$P(t) = \int_{-\infty}^{t} E(t')\,G(t, t')\,dt'. \tag{7.21}$$


5 For further reading about the Stokes parameters, as well as about many optics topics I will not have time to
cover (especially the geometrical optics and the diffraction-imposed limits on optical imaging resolution), I can
recommend the classical text by M. Born et al., Principles of Optics, 7th ed., Cambridge U. Press, 1999.

6 In an isotropic medium, vectors E, P, and hence $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$, are all parallel, and for the notation simplicity I will
drop the vector sign. I am also assuming that P at any point r is only dependent on the electric field at the same
point, and hence drop term ikz from the exponent’s argument. This assumption is valid if wavelength $\lambda$ is much
larger than the elementary media dipole size a. In most systems of interest, the scale of a is atomic ($\sim 10^{-10}$ m), so
that the last approximation is valid up to very high frequencies, $\omega \sim c/a \sim 10^{18}$ s$^{-1}$, corresponding to hard X-rays.

7 Note that this condition (which excludes, in particular, the molecular-field effects discussed in Sec. 3.5) is not
mentioned in most E&M textbooks. If the molecular fields are important, Eq. (21) and its corollaries are only
valid for the relation between P and the effective local electric field $E_{\rm ef}$.


The condition t’ < t, which is implied by this relation, expresses a key principle of physics, the
causal relation between a cause (in our case, the electric field applied to each dipole) and its effect (the
polarization it creates). Function G(t, t’) is called the temporal Green’s function for the electric
polarization. 8 In order to understand its physical sense, let us consider the case when the applied field
E(t) is a very short pulse at $t = t_0$, that may be approximated with the Dirac’s delta-function:

$$E(t) = \delta(t - t_0). \tag{7.22}$$

Then Eq. (21) yields just $P(t) = G(t, t_0)$, showing that the Green’s function is just the polarization at
moment t, created by a unit $\delta$-functional pulse of the applied field at moment t’ (Fig. 4). Thus, the
temporal G is the exact time analog of the spatial Green’s functions G(r, r’) we have already studied in
the electrostatics - see Sec. 2.7.



Fig. 7.4. Temporal Green’s function for 
electric polarization (schematically). 


What are the general properties of the temporal Green’s function? First, the function is evidently
real, since the dipole moment p and hence polarization P = np are real by the definition - see Eq. (3.6).
Next, for systems without infinite internal memory, G should tend to zero at $t - t' \to \infty$, although the
type of this approach (e.g., whether function G oscillates approaching zero) depends on the medium.
Finally, if parameters of the medium do not change in time, the polarization response to an electric field
pulse should depend not on its absolute timing, but only on the time difference $\theta \equiv t - t'$ between the
pulse and observation instants:

$$P(t) = \int_{-\infty}^{t} E(t')\,G(t - t')\,dt' = \int_0^\infty E(t - \theta)\,G(\theta)\,d\theta. \tag{7.23}$$

For a sinusoidal waveform, $E(t) = \mathrm{Re}\left[E_\omega e^{-i\omega t}\right]$, this equation yields

$$P(t) = \mathrm{Re}\int_0^\infty E_\omega e^{-i\omega(t - \theta)}\,G(\theta)\,d\theta = \mathrm{Re}\left\{\left[E_\omega\int_0^\infty G(\theta)\,e^{i\omega\theta}\,d\theta\right]e^{-i\omega t}\right\}. \tag{7.24}$$

The expression in square brackets is of course nothing more than the complex amplitude $P_\omega$ of the
polarization. This means that even though the static relation (3.35) $P = \chi_e\varepsilon_0 E$ is invalid for an arbitrary
time-dependent process, we may still keep its Fourier analog,

$$P_\omega = \chi_e(\omega)\,\varepsilon_0 E_\omega, \quad \text{with } \chi_e(\omega) \equiv \frac{1}{\varepsilon_0}\int_0^\infty G(\theta)\,e^{i\omega\theta}\,d\theta, \tag{7.25}$$


8 A discussion of the temporal Green’s functions in application to classical oscillations may be also found in CM 
Sec. 4.1. 
for each sinusoidal component of the process, using it as the definition of the frequency-dependent electric susceptibility $\chi_e(\omega)$. Similarly, the frequency-dependent electric permittivity may be defined using the Fourier analog of Eq. (3.38):


$$D_\omega = \varepsilon(\omega)\,E_\omega. \tag{7.26}$$


Then, according to Eq. (3.36), the permittivity is related to the temporal Green’s function by the usual 
Fourier transform: 


Complex electric permittivity

$$\varepsilon(\omega) = \varepsilon_0 + \frac{P_\omega}{E_\omega} = \varepsilon_0 + \int_{0}^{\infty} G(\theta)\,e^{i\omega\theta}\,d\theta. \tag{7.27}$$


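It is evident from this expression that $\varepsilon(\omega)$ may be complex,

$$\varepsilon(\omega) = \varepsilon'(\omega) + i\varepsilon''(\omega), \quad \varepsilon'(\omega) = \varepsilon_0 + \int_{0}^{\infty} G(\theta)\cos\omega\theta\,d\theta, \quad \varepsilon''(\omega) = \int_{0}^{\infty} G(\theta)\sin\omega\theta\,d\theta, \tag{7.28}$$

and that its real part $\varepsilon'(\omega)$ is always an even function of frequency, while the imaginary part $\varepsilon''(\omega)$ is an odd function of $\omega$.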
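As a quick numerical illustration of Eqs. (27)-(28), the sketch below evaluates the integral $f(\omega) = \varepsilon(\omega) - \varepsilon_0 = \int_0^\infty G(\theta)e^{i\omega\theta}d\theta$ by the midpoint rule. The Green's function used here, that of a weakly damped oscillator, and all parameter values (in arbitrary units) are assumptions made only for this example; they are not taken from the text above.

```python
import cmath
import math

# Hypothetical single-resonance parameters (arbitrary units), for illustration only.
OMEGA_0 = 1.0   # own frequency of the oscillator
DELTA = 0.1     # damping coefficient
AMP = 1.0       # overall response strength

def green(theta):
    """Assumed temporal Green's function of a weakly damped oscillator (theta >= 0)."""
    big_omega = math.sqrt(OMEGA_0**2 - DELTA**2)
    return AMP / big_omega * math.exp(-DELTA * theta) * math.sin(big_omega * theta)

def f_numeric(omega, theta_max=200.0, steps=200_000):
    """Midpoint-rule estimate of the integral of G(theta)*exp(i*omega*theta) over theta >= 0."""
    h = theta_max / steps
    total = 0j
    for k in range(steps):
        theta = (k + 0.5) * h
        total += green(theta) * cmath.exp(1j * omega * theta)
    return total * h

def f_exact(omega):
    """Closed-form transform of the same G(theta)."""
    return AMP / ((OMEGA_0**2 - omega**2) - 2j * omega * DELTA)

f_plus = f_numeric(0.7)
f_minus = f_numeric(-0.7)
print("numeric:", f_plus, " exact:", f_exact(0.7))
print("f(-omega):", f_minus)
```

Comparing `f_plus` and `f_minus` shows the symmetry stated above: the real part does not change at the frequency sign reversal, while the imaginary part changes its sign.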


Absolutely similar arguments show that the linear magnetic properties may be characterized with a complex, frequency-dependent permeability $\mu(\omega)$. Now rewriting Eqs. (1) for the complex amplitudes of the fields at a particular frequency, we may repeat all calculations of Sec. 1, and verify that all its results are valid for monochromatic waves even for a dispersive (but necessarily linear!) medium. In particular, Eqs. (7) and (13) now become

$$Z(\omega) = \left[\frac{\mu(\omega)}{\varepsilon(\omega)}\right]^{1/2}, \qquad k(\omega) = \omega\left[\varepsilon(\omega)\mu(\omega)\right]^{1/2}, \tag{7.28}$$

so that the wave impedance and wave number may be both complex functions of frequency.

This fact has important consequences for the electromagnetic wave propagation. First, plugging the representation of the complex wave number as the sum of its real and imaginary parts, $k(\omega) = k'(\omega) + ik''(\omega)$, into Eq. (11):

$$f = \mathrm{Re}\left\{f_\omega e^{i[k(\omega)z - \omega t]}\right\} = e^{-k''(\omega)z}\,\mathrm{Re}\left\{f_\omega e^{i[k'(\omega)z - \omega t]}\right\}, \tag{7.29}$$

we see that $k''(\omega)$ describes the rate of wave attenuation in the medium at frequency $\omega$. 9 Second, if the waveform is not sinusoidal (and hence should be represented as a sum of several/many sinusoidal components), the frequency dependence of $k'(\omega)$ provides for wave dispersion, i.e. the waveform deformation at the propagation, because the propagation velocity (4) of component waves is now different. 10


9 It may be tempting to attribute this effect to wave absorption, i.e. the dissipation of the wave’s energy, but we 
will see very soon that wave attenuation may be also due to effects different from absorption. 

10 The reader is probably familiar with the most noticeable effect of the dispersion, namely the difference between the group velocity $v_{gr} \equiv d\omega/dk'$, giving the speed of the envelope of a wave packet with a narrow frequency spectrum, and the phase velocity $v_{ph} \equiv \omega/k'$ of the component waves. The second-order dispersion effect, proportional to $d^2\omega/dk'^2$, leads to the deformation (gradual broadening) of the envelope itself. Following tradition,
Let us consider a simple but very important Lorentz oscillator model of a dispersive medium. 11 In dilute atomic or molecular systems (including gases), electrons respond to the external electric field especially strongly when frequency $\omega$ is close to certain eigenfrequencies $\omega_j$ corresponding to the spectrum of quantum transitions of a single atom/molecule. An approximate, phenomenological description of this behavior may be obtained from a classical model of several externally driven harmonic oscillators with finite damping. For an oscillator driven by the electric field's force $F(t) = qE(t)$, we can write the 2nd Newton law as

Lorentz oscillator model

$$m\left(\ddot{x} + 2\delta\dot{x} + \omega_0^2 x\right) = qE(t), \tag{7.30}$$

where $\omega_0$ is the own frequency of the oscillator, and $\delta$ is its damping coefficient. For a sinusoidal field, $E(t) = \mathrm{Re}\,[E_\omega\exp\{-i\omega t\}]$, we can look for a particular, forced-oscillation solution in a similar form $x(t) = \mathrm{Re}\,[x_\omega\exp\{-i\omega t\}]$. 12 Plugging this solution into Eq. (30), we can readily find the complex amplitude of these oscillations:

$$x_\omega = \frac{q}{m}\,\frac{E_\omega}{(\omega_0^2 - \omega^2) - 2i\omega\delta}. \tag{7.31}$$


Using this result to calculate the complex amplitude of the dipole moment as $p_\omega = qx_\omega$, and then the electric polarization $P_\omega = np_\omega$ of a dilute medium with $n$ independent oscillators per unit volume, for its frequency-dependent permittivity (27) we get

$$\varepsilon(\omega) = \varepsilon_0 + n\frac{q^2}{m}\,\frac{1}{(\omega_0^2 - \omega^2) - 2i\omega\delta}. \tag{7.32}$$

This result may be readily generalized to the case when the system has several types of oscillators with different eigenfrequencies:

$\varepsilon(\omega)$ in Lorentz oscillator model

$$\varepsilon(\omega) = \varepsilon_0 + nq^2\sum_j \frac{f_j}{m_j\left[(\omega_j^2 - \omega^2) - 2i\omega\delta_j\right]}, \tag{7.33}$$

where $f_j \equiv n_j/n$ is the fraction of oscillators with eigenfrequency $\omega_j$, so that the sum of all $f_j$ equals 1. Figure 5 shows a typical behavior of the real and imaginary parts of the complex dielectric constant, described by Eq. (33), as functions of frequency. The effect of oscillator resonances is clearly visible, and dominates the medium's response at $\omega \approx \omega_j$, especially in the case of low damping, $\delta_j \ll \omega_j$. Note that in the low-damping limit, the imaginary part of the dielectric constant $\varepsilon''$, and hence the wave attenuation $k''$, are negligibly small at all frequencies besides small vicinities of frequencies $\omega_j$, where the derivative $d\varepsilon'(\omega)/d\omega$ is negative. 13 Thus, for a system of weakly damped oscillators, Eq. (33) may be approximated, at most frequencies, as a sum of odd singularities ("poles"):

these effects are discussed in more detail in the quantum-mechanics part of my lecture notes (QM Sec. 2.1), because they are the crucial factor of Schrödinger's wave mechanics. (See also CM Sec. 5.3.)

11 This example is focused on the frequency dependence of $\varepsilon$, because electromagnetic waves interact with "usual" media via their electric field much more than via the magnetic field. However, as will be discussed in Sec. 7, forgetting about the possible dispersion of $\mu(\omega)$ might result in missing some remarkable opportunities for manipulating the waves.

12 If this point is not absolutely clear, please see CM Sec. 3.1. 

13 In optics, such behavior is called the anomalous dispersion. 
$$\varepsilon(\omega) \approx \varepsilon_0 + \frac{nq^2}{2}\sum_j \frac{f_j}{m_j\,\omega_j\,(\omega_j - \omega)}, \quad \text{for } \delta_j \ll |\omega - \omega_j| \ll |\omega_j - \omega_{j'}|. \tag{7.34}$$


This result is especially important because, according to quantum mechanics, 14 Eq. (34) is also valid for a set of non-interacting, similar quantum systems (whose dynamics may be completely different from that of a harmonic oscillator!), provided that $\omega_j$ are replaced with frequencies of possible quantum interstate transitions, and coefficients $f_j$ are replaced with the so-called oscillator strengths of the transitions - which obey the same sum rule, $\sum_j f_j = 1$.



Fig. 7.5. Typical frequency dependence of the real and imaginary parts of the electric permittivity of a medium in the Lorentz oscillator model.


At $\omega \to 0$, the imaginary part of the permittivity also vanishes (for any $\delta_j$), while its real part approaches its electrostatic ("dc") value

$$\varepsilon(0) = \varepsilon_0 + q^2\sum_j \frac{n_j}{m_j\omega_j^2}. \tag{7.35}$$

Note that according to Eq. (30), the denominator in Eq. (35) is just the effective spring constant $\kappa_j = m_j\omega_j^2$ of the $j$-th oscillator, so that the oscillator masses $m_j$ as such are actually (and quite naturally) not involved in the static dielectric response.
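The limits of the Lorentz oscillator model are easy to check numerically. The following is only a sketch, assuming the multi-oscillator form $\varepsilon(\omega) - \varepsilon_0 = nq^2\sum_j f_j / \{m_j[(\omega_j^2 - \omega^2) - 2i\omega\delta_j]\}$ of Eq. (33); the function name, the density, and the two resonances below are hypothetical values chosen for illustration.

```python
EPS0 = 8.8541878128e-12     # F/m, electric constant
E_CHARGE = 1.602176634e-19  # C
M_E = 9.1093837015e-31      # kg

def lorentz_susceptibility(omega, n, q, oscillators):
    """Return eps(omega) - eps0 for a list of (f_j, m_j, omega_j, delta_j) tuples."""
    total = 0j
    for f_j, m_j, omega_j, delta_j in oscillators:
        total += f_j / (m_j * ((omega_j**2 - omega**2) - 2j * omega * delta_j))
    return n * q**2 * total

# Hypothetical medium: two electron resonances, fractions summing to 1.
N = 1.0e25  # oscillators per cubic meter (assumed)
OSCILLATORS = [(0.6, M_E, 1.0e16, 1.0e14),
               (0.4, M_E, 3.0e16, 2.0e14)]

# Static limit: only the spring constants m_j * omega_j**2 matter.
static = lorentz_susceptibility(0.0, N, E_CHARGE, OSCILLATORS)
static_expected = E_CHARGE**2 * sum(N * f / (m * w**2) for f, m, w, _ in OSCILLATORS)

# High-frequency limit: the response tends to -eps0 * omega_p**2 / omega**2.
omega_high = 1.0e19
high = lorentz_susceptibility(omega_high, N, E_CHARGE, OSCILLATORS)
omega_p_sq = (E_CHARGE**2 / EPS0) * sum(N * f / m for f, m, _, _ in OSCILLATORS)
high_expected = -EPS0 * omega_p_sq / omega_high**2

print("eps(0)/eps0 =", 1 + static.real / EPS0)
print("high-frequency ratio =", high / high_expected)
```

The susceptibility is returned separately from $\varepsilon_0$ because at very high frequencies it is many orders of magnitude smaller than $\varepsilon_0$, and adding them first would waste floating-point precision in the comparison.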


In the opposite limit $\omega \gg \omega_j, \delta_j$, permittivity (33) also becomes real, and may be represented as

$\varepsilon(\omega)$ in plasma

$$\varepsilon(\omega) = \varepsilon_0\left(1 - \frac{\omega_p^2}{\omega^2}\right), \quad \text{where } \omega_p^2 \equiv \frac{q^2}{\varepsilon_0}\sum_j \frac{n_j}{m_j}. \tag{7.36}$$

The last result is very important, because it is also valid at all frequencies if all $\omega_j$ and $\delta_j$ vanish, i.e. for a gas of free charged particles, in particular for plasmas - ionized atomic gases, with negligible collision effects. (This is why the parameter $\omega_p$ defined by Eq. (36) is called the plasma frequency.) Typically, the plasma as a whole is neutral, i.e. the density $n$ of positive atomic ions is equal to that of


14 See, e.g., QM Chapters 5 and 9. 
the free electrons. Since the ratio $n_j/m_j$ for electrons is much higher than that for ions, the general formula (36) for the plasma frequency is usually well approximated by the following simple expression:

$$\omega_p^2 = \frac{ne^2}{\varepsilon_0 m_e}. \tag{7.37}$$

This expression has a simple physical sense: the effective spring constant $\kappa_{ef} \equiv m_e\omega_p^2 = ne^2/\varepsilon_0$ describes the Coulomb force that appears when the electron subsystem of a plasma is shifted, as a whole, from its positive-ion subsystem, thus violating the electroneutrality. Indeed, consider such a small shift, $\Delta x$, perpendicular to the plane surface of a broad, plane slab filled with plasma. The uncompensated charges, with equal and opposite surface densities $\sigma = \pm en\Delta x$, that appear at the slab surfaces, create inside it, according to Eq. (2.3), a uniform electric field $E_x = en\Delta x/\varepsilon_0$. This field exerts force $eE = (ne^2/\varepsilon_0)\Delta x$ on each positively charged ion. According to the 3rd Newton law, the ions pull each electron back to its equilibrium position with the equal and opposite force $F = -eE = -(ne^2/\varepsilon_0)\Delta x$, justifying the above expression for $\kappa_{ef}$. Hence it is not surprising that $\varepsilon(\omega)$ described by the first of Eqs. (36) turns into zero at $\omega = \omega_p$: at this resonance frequency, finite free oscillations of charge (and hence of $D = \varepsilon E$) do not require a finite force (and hence $E$).

The behavior of electromagnetic waves in a medium that obeys Eq. (36) is very remarkable. If the wave frequency $\omega$ is above $\omega_p$, the dielectric constant and hence the wave number (28) are positive and real, and waves propagate without attenuation, following the dispersion relation,

Plasma dispersion relation

$$k(\omega) = \omega\left[\varepsilon(\omega)\mu_0\right]^{1/2} = \frac{1}{c}\left(\omega^2 - \omega_p^2\right)^{1/2}, \tag{7.38}$$

which is shown in Fig. 6. (As we will see later in this chapter, many wave transmission systems obey such a dispersion law as well.)



Fig. 7.6. Plasma dispersion law (solid line) in 
comparison with the linear dispersion of the 
free space (dashed line). 


At $\omega \to \omega_p$ the wave number $k$ tends to zero. Beyond that point (at $\omega < \omega_p$), we still can use Eq. (38), but it is more instrumental to rewrite it in the mathematically equivalent form

$$k(\omega) = \frac{i}{c}\left(\omega_p^2 - \omega^2\right)^{1/2} = \frac{i}{\delta}, \quad \text{where } \delta \equiv \frac{c}{\left(\omega_p^2 - \omega^2\right)^{1/2}}. \tag{7.39}$$

According to Eq. (29), this means that the electromagnetic field exponentially decreases with distance:

$$f = \mathrm{Re}\, f_\omega e^{i(kz - \omega t)} = \exp\left\{-\frac{z}{\delta}\right\}\mathrm{Re}\, f_\omega e^{-i\omega t}. \tag{7.40}$$
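The two regimes of the plasma dispersion law may be sketched in a few lines of code. The snippet below assumes the relation $k(\omega) = (\omega^2 - \omega_p^2)^{1/2}/c$ that follows from Eqs. (28) and (36) with $\mu = \mu_0$, and takes the branch of the square root with $\mathrm{Im}\,k \geq 0$; the function names and the sample frequencies are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # m/s, speed of light in free space

def plasma_wave_number(omega, omega_p):
    """Complex wave number in a collision-free plasma; Im k >= 0 below omega_p."""
    return cmath.sqrt(complex(omega**2 - omega_p**2, 0.0)) / C

def field_envelope(z, omega, omega_p):
    """Attenuation factor exp(-k'' z) of the field amplitude at depth z."""
    return math.exp(-plasma_wave_number(omega, omega_p).imag * z)

OMEGA_P = 2 * math.pi * 5.0e6             # assumed plasma frequency, 5 MHz
k_above = plasma_wave_number(2 * math.pi * 13.0e6, OMEGA_P)
k_below = plasma_wave_number(2 * math.pi * 3.0e6, OMEGA_P)
penetration_depth = 1.0 / k_below.imag    # the length delta of Eq. (39)

print("above omega_p: k =", k_above)      # purely real: propagation
print("below omega_p: k =", k_below)      # purely imaginary: evanescent field
print("penetration depth, m:", penetration_depth)
```

With these sample numbers, the penetration depth is close to 12 m, so that a layer a few hundred meters thick is already virtually opaque for the 3-MHz wave.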


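Does this mean that the wave is being absorbed in the plasma? Answering this question is a good pretext to calculate the time average of the Poynting vector $\mathbf{S} = \mathbf{E}\times\mathbf{H}$ of a monochromatic electromagnetic wave in an arbitrary dispersive (but still linear!) medium. First, let us spell out the fields' time dependence:

$$E(t) = \mathrm{Re}\left[E_\omega(z)e^{-i\omega t}\right] = \frac{1}{2}\left[E_\omega e^{-i\omega t} + \text{c.c.}\right], \qquad H(t) = \mathrm{Re}\left[H_\omega(z)e^{-i\omega t}\right] = \frac{1}{2}\left[\frac{E_\omega}{Z(\omega)}e^{-i\omega t} + \text{c.c.}\right]. \tag{7.41}$$

Now, a straightforward calculation yields 15

$$\bar{S} = \overline{E(t)H(t)} = \frac{E_\omega E_\omega^*}{4}\left[\frac{1}{Z(\omega)} + \frac{1}{Z^*(\omega)}\right] = \frac{|E_\omega|^2}{2}\,\mathrm{Re}\,\frac{1}{Z(\omega)}. \tag{7.42}$$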


Let us apply this important general formula to our simple model of plasma at $\omega < \omega_p$. In this case $\mu(\omega) = \mu_0$, i.e. is positive and real, while $\varepsilon(\omega)$ is real and negative, so that $1/Z(\omega) = [\varepsilon(\omega)/\mu(\omega)]^{1/2}$ is purely imaginary, and the average Poynting vector (42) vanishes. This means that energy, on the average, does not flow along axis $z$ - as it would if it were absorbed in plasma. As we will see in the next section, waves with $\omega < \omega_p$ are rather reflected from the plasma's boundary, without energy loss. Note that in the limit $\omega \ll \omega_p$, Eq. (39) yields

$$\delta \to \frac{c}{\omega_p} = \left(\frac{c^2\varepsilon_0 m_e}{ne^2}\right)^{1/2} = \left(\frac{m_e}{\mu_0 ne^2}\right)^{1/2}. \tag{7.43}$$

But this is just a particular case (for $q = e$ and $\mu = \mu_0$) of the expression (6.38) that we have derived for the depth of magnetic field penetration into a lossless (collision-free) conductor in the quasistatic approximation. We see again that, as was already discussed in Sec. 6.7, that approximation (in which we neglect the displacement currents) gives an adequate description of the time-dependent phenomena at $\omega \ll \omega_p$, i.e. at $\delta \ll c/\omega = 1/k = \lambda/2\pi$.

There are two most important examples of plasmas. For the Earth's ionosphere, i.e. the upper part of the atmosphere that is almost completely ionized by the UV and X-ray components of the Sun's radiation, the maximum value of $n$, reached at about 300 km over the Earth's surface, is between $10^{10}$ and $10^{12}$ m$^{-3}$ (depending on the time of the day and the Sun's activity), so that the maximum plasma frequency (37) is between 1 and 10 MHz. This is much higher than the particles' reciprocal collision time $\tau^{-1}$, so that Eq. (36) gives a very good description of the plasma's electric polarization. The effect of reflection of waves with $\omega < \omega_p$ from the ionosphere enables long-range (over-the-globe) radio communications and broadcasting at the so-called short waves, with frequencies of the order of 10 MHz.


15 For an arbitrary plane wave the total average power flow may be calculated as an integral of Eq. (42) over all 
frequencies. By the way, combining this integral and the Poynting theorem (6.103), one can also prove the 
following interesting expression for the average electromagnetic energy density in an arbitrary dispersive (but 
linear and isotropic) medium: 




$$\bar{u} = \ldots\left[\frac{d(\omega\varepsilon)}{d\omega}\,E_\omega E_\omega^{*} + \frac{d(\omega\mu)}{d\omega}\,H_\omega H_\omega^{*}\right]d\omega.$$
Such waves may propagate in the flat channel formed by the Earth's surface and the ionosphere, reflected repeatedly by these "walls". Unfortunately, due to the random variations of the Sun's activity, and hence of $\omega_p$, this natural communication channel is not too reliable, and in our age of fiber-optic cables its practical importance is diminishing.

Another important example of plasmas is free electrons in metals and other conductors. For a typical metal, $n$ is of the order of $10^{23}$ cm$^{-3}$ = $10^{29}$ m$^{-3}$, so that Eq. (37) yields $\omega_p \sim 10^{16}$ s$^{-1}$. Note that this value of $\omega_p$ is somewhat higher than mid-optical frequencies ($\omega \sim 3\times10^{15}$ s$^{-1}$). This explains why planar, even, clean metallic surfaces, such as aluminum and silver films used in mirrors, are so shiny: at these frequencies the permittivity is almost exactly real and negative, leading to light reflection, with very little absorption. However, the considered model, which neglects electron scattering, becomes inadequate at lower frequencies, $\omega\tau \sim 1$.
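The numerical estimates for the two plasmas are easy to reproduce. The short sketch below evaluates the electron plasma frequency $\omega_p = (ne^2/\varepsilon_0 m_e)^{1/2}$ of Eq. (37) for the densities quoted above; the function name is hypothetical, and the constants are the standard SI values.

```python
import math

E_CHARGE = 1.602176634e-19  # C
M_E = 9.1093837015e-31      # kg
EPS0 = 8.8541878128e-12     # F/m

def plasma_frequency(n):
    """Angular plasma frequency (rad/s) of free electrons with density n (1/m^3)."""
    return math.sqrt(n * E_CHARGE**2 / (EPS0 * M_E))

# Ionosphere: the cyclic frequency f_p = omega_p / (2*pi) for the two density bounds.
f_low = plasma_frequency(1.0e10) / (2 * math.pi)
f_high = plasma_frequency(1.0e12) / (2 * math.pi)

# Typical metal.
omega_metal = plasma_frequency(1.0e29)

print(f"ionosphere: {f_low / 1e6:.2f} MHz to {f_high / 1e6:.2f} MHz")
print(f"metal: omega_p = {omega_metal:.2e} rad/s")
```

The results, about 0.9 and 9 MHz for the ionosphere and about $1.8\times10^{16}$ s$^{-1}$ for the metal, agree with the orders of magnitude cited in the text.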

A phenomenological way of extending the model to account for scattering is to take, in Eq. (33), the lowest eigenfrequency $\omega_j$ to be equal to zero (to describe free electrons), while keeping the damping coefficient $\delta_0$ of this mode finite, to account for their energy loss due to scattering. Then Eq. (33) is reduced to

$$\varepsilon_{ef}(\omega) = \varepsilon_{opt}(\omega) + n_0\frac{q^2}{m}\,\frac{1}{-\omega^2 - 2i\omega\delta_0} = \varepsilon_{opt}(\omega) + i\,\frac{n_0 q^2}{2\delta_0 m\,\omega}\,\frac{1}{1 - i\omega/2\delta_0}, \tag{7.44}$$

where the response $\varepsilon_{opt}(\omega)$ at high (in practice, optical) frequencies is still given by Eq. (33), but now with $j \neq 0$.

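Result (44) allows for a simple interpretation. To show that, let us incorporate into our calculations the Ohmic conduction, generalizing Eq. (4.7) as $\mathbf{j}_\omega = \sigma(\omega)\mathbf{E}_\omega$ to account for the possible frequency dependence of the Ohmic conductivity. Plugging this relation into the Fourier image of the relevant Maxwell equation, $\nabla\times\mathbf{H}_\omega = \mathbf{j}_\omega - i\omega\mathbf{D}_\omega = \mathbf{j}_\omega - i\omega\varepsilon(\omega)\mathbf{E}_\omega$, we get

$$\nabla\times\mathbf{H}_\omega = \left[\sigma(\omega) - i\omega\varepsilon(\omega)\right]\mathbf{E}_\omega. \tag{7.45}$$

This relation shows that for a sinusoidal process, the addition of the Ohmic current density $\mathbf{j}_\omega$ to the displacement current density is equivalent to the addition of $\sigma(\omega)$ to $-i\omega\varepsilon(\omega)$, i.e. to the following change of the ac electric permittivity: 16

$$\varepsilon(\omega) \to \varepsilon_{ef}(\omega) \equiv \varepsilon_{opt}(\omega) + i\,\frac{\sigma(\omega)}{\omega}. \tag{7.46}$$

Now the comparison of Eqs. (44) and (46) shows that they coincide if we take

Generalized Drude formula

$$\sigma(\omega) = \sigma(0)\,\frac{1}{1 - i\omega\tau}, \tag{7.47}$$

where the dc conductivity $\sigma(0)$ is described by the Drude formula (4.13), and the phenomenologically introduced coefficient $\delta_0$ is associated with $1/2\tau$. Relation (47), which is frequently called the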


16 Alternatively, according to Eq. (45), it is possible (and in infrared spectroscopy, conventional) to attribute the ac response of a medium at all frequencies to the effective complex conductivity $\sigma_{ef}(\omega) \equiv \sigma(\omega) - i\omega\varepsilon(\omega) = -i\omega\varepsilon_{ef}(\omega)$.




generalized (or "ac", or "rf") Drude formula, 17 gives a very reasonable (semi-quantitative) description of the ac conductivity of many metals almost all the way up to optical frequencies.
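The equivalence of the two descriptions of free electrons may be verified numerically. The sketch below assumes the generalized Drude form $\sigma(\omega) = \sigma(0)/(1 - i\omega\tau)$ with $\sigma(0) = nq^2\tau/m$ and $\delta_0 = 1/2\tau$, and compares the resulting term $i\sigma(\omega)/\omega$ with the zero-eigenfrequency oscillator term of Eq. (44). The function names, the electron density, and the scattering time are illustrative assumptions, of the order of those for a good metal.

```python
E_CHARGE = 1.602176634e-19  # C
M_E = 9.1093837015e-31      # kg

def drude_conductivity(omega, n, tau):
    """Generalized (ac) Drude conductivity for free electrons of density n."""
    sigma_dc = n * E_CHARGE**2 * tau / M_E
    return sigma_dc / (1 - 1j * omega * tau)

def free_electron_term(omega, n, tau):
    """Oscillator-model term n*q^2/m / (-omega^2 - 2i*omega*delta_0), delta_0 = 1/(2 tau)."""
    delta_0 = 1.0 / (2.0 * tau)
    return n * E_CHARGE**2 / M_E / (-omega**2 - 2j * omega * delta_0)

N_METAL = 8.5e28   # 1/m^3, assumed
TAU = 2.5e-14      # s, assumed

sigma_dc = drude_conductivity(0.0, N_METAL, TAU).real
for omega in (1.0e12, 4.0e13, 3.0e15):
    via_conductivity = 1j * drude_conductivity(omega, N_METAL, TAU) / omega
    via_oscillator = free_electron_term(omega, N_METAL, TAU)
    print(omega, via_conductivity, via_oscillator)
print("dc conductivity, S/m:", sigma_dc)
```

The two columns coincide at each frequency, below, near, and above the crossover $\omega\tau \sim 1$.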


7.3. Kramers-Kronig relations 

The results for the simple model of dispersion, discussed in the last section, imply that the frequency dependences of the real ($\varepsilon'$) and imaginary ($\varepsilon''$) parts of the permittivity are not quite independent. For example, let us have one more look at the resonance peaks in Fig. 5. Each time the real part drops with frequency, $d\varepsilon'/d\omega < 0$, its imaginary part $\varepsilon''$ has a positive peak. R. de L. Kronig in 1926 and H. A. Kramers in 1927 independently showed that this is not an accidental coincidence pertinent only to the Lorentz oscillator model. Moreover, the full knowledge of function $\varepsilon'(\omega)$ allows one to calculate function $\varepsilon''(\omega)$, and vice versa. The reason is that both these functions are always related to a single real function $G(\theta)$ by Eqs. (28).

To derive the Kramers-Kronig relations, let us consider Eq. (27) on the complex frequency plane, $\omega \to \omega' + i\omega''$:

$$f(\omega) \equiv \varepsilon(\omega) - \varepsilon_0 = \int_{0}^{\infty} G(\theta)\,e^{i\omega\theta}\,d\theta = \int_{0}^{\infty} G(\theta)\,e^{i\omega'\theta}e^{-\omega''\theta}\,d\theta. \tag{7.48}$$

For all stable physical systems, $G(\theta)$ has to be finite for all important values of the integration variable ($\theta > 0$), and tend to zero at $\theta \to 0$ and $\theta \to \infty$. Because of that, and thanks to the factor $e^{-\omega''\theta}$, the expression under the integral tends to zero at $|\omega| \to \infty$ in all the upper half-plane ($\omega'' > 0$). As a result, we may claim that the complex-variable function $f(\omega)$ is analytical in that half-plane, and this allows us to apply to it the Cauchy integral formula 18


$$f(\omega) = \frac{1}{2\pi i}\oint_C f(\Omega)\,\frac{d\Omega}{\Omega - \omega}, \tag{7.49}$$

with the integration contour of the form shown in Fig. 7, with radius $R$ of the larger semicircle tending to infinity, and radius $r$ of the smaller semicircle (about the singular point $\Omega = \omega$) tending to zero.



Fig. 7.7. Integration path C used in the 
Cauchy integral formula to derive the 
Kramers-Kronig dispersion relations. 


17 It may be also derived from the Boltzmann kinetic equation in the so-called relaxation- time approximation 
(RTA) - see, e.g., SM Sec. 6.2. 

18 See, e.g., MA Eq. (15.2). 
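Due to the exponential decay of $|f(\Omega)|$ at $|\Omega| \to \infty$, the contribution to the integral from the larger semicircle vanishes, 19 while the contribution from the small semicircle, where $\Omega = \omega + r\exp\{i\varphi\}$, with $-\pi \leq \varphi \leq 0$, is

$$\lim_{r\to0}\frac{1}{2\pi i}\int_{\Omega = \omega + r\exp\{i\varphi\}} f(\Omega)\,\frac{d\Omega}{\Omega - \omega} = \frac{f(\omega)}{2\pi i}\int_{-\pi}^{0}\frac{ir\exp\{i\varphi\}\,d\varphi}{r\exp\{i\varphi\}} = \frac{f(\omega)}{2\pi}\int_{-\pi}^{0}d\varphi = \frac{1}{2}f(\omega). \tag{7.50}$$

As a result, for our contour $C$, Eq. (49) yields

$$f(\omega) = \lim_{r\to0}\frac{1}{2\pi i}\left(\int_{-\infty}^{\omega - r} + \int_{\omega + r}^{+\infty}\right) f(\Omega)\,\frac{d\Omega}{\Omega - \omega} + \frac{1}{2}f(\omega). \tag{7.51}$$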


Such an integral, excluding a symmetric infinitesimal vicinity of the pole singularity, is called the principal value of the (formally, diverging) integral from $-\infty$ to $+\infty$, and is denoted by letter P before it. 20 Using this notation, subtracting $f(\omega)/2$ from both parts of Eq. (51), and multiplying them by 2, we get

$$f(\omega) = \frac{1}{\pi i}\,\mathrm{P}\!\int_{-\infty}^{+\infty} f(\Omega)\,\frac{d\Omega}{\Omega - \omega}. \tag{7.52}$$
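Now plugging into this complex equality the polarization-related difference $f(\omega) \equiv \varepsilon(\omega) - \varepsilon_0$ in the form $[\varepsilon'(\omega) - \varepsilon_0] + i[\varepsilon''(\omega)]$, and requiring both real and imaginary components of both parts of Eq. (52) to be equal separately, we get the famous Kramers-Kronig dispersion relations

Kramers-Kronig dispersion relations

$$\varepsilon'(\omega) = \varepsilon_0 + \frac{1}{\pi}\,\mathrm{P}\!\int_{-\infty}^{+\infty}\varepsilon''(\Omega)\,\frac{d\Omega}{\Omega - \omega}, \qquad \varepsilon''(\omega) = -\frac{1}{\pi}\,\mathrm{P}\!\int_{-\infty}^{+\infty}\left[\varepsilon'(\Omega) - \varepsilon_0\right]\frac{d\Omega}{\Omega - \omega}. \tag{7.53}$$

Now we may use the already mentioned fact that $\varepsilon'(\omega)$ is always an even, while $\varepsilon''(\omega)$ an odd function of frequency, to rewrite these relations in the following form

$$\varepsilon'(\omega) = \varepsilon_0 + \frac{2}{\pi}\,\mathrm{P}\!\int_{0}^{+\infty}\varepsilon''(\Omega)\,\frac{\Omega\,d\Omega}{\Omega^2 - \omega^2}, \qquad \varepsilon''(\omega) = -\frac{2\omega}{\pi}\,\mathrm{P}\!\int_{0}^{+\infty}\left[\varepsilon'(\Omega) - \varepsilon_0\right]\frac{d\Omega}{\Omega^2 - \omega^2}, \tag{7.54}$$

which is more convenient for most applications, because it involves only physical (positive) frequencies.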
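The first of Eqs. (54) may be tested numerically on any causal model. The sketch below does this for a single Lorentz-type resonance, $f(\omega) = A/[(\omega_0^2 - \omega^2) - 2i\omega\delta]$, restoring its real part from its imaginary part. All parameter values (in arbitrary units) and the function names are assumptions for illustration. The principal value is approximated with a midpoint grid whose nodes are placed symmetrically about the pole at $\Omega = \omega$, so that the diverging contributions from its two sides cancel.

```python
import math

OMEGA_0, DELTA, AMP = 1.0, 0.1, 1.0  # hypothetical resonance parameters

def chi(omega):
    """Model response f(omega) = eps(omega) - eps0 of a single damped oscillator."""
    return AMP / ((OMEGA_0**2 - omega**2) - 2j * omega * DELTA)

def chi_imag(omega):
    return chi(omega).imag

def kk_real_part(imag_part, m, h, steps):
    """(2/pi) P-integral of imag_part(W) * W / (W^2 - w^2) over 0 < W < steps*h.

    The observation frequency is w = m*h, while the grid nodes are W_k = (k + 0.5)*h,
    so the pole always falls exactly midway between two neighboring nodes.
    """
    omega = m * h
    total = 0.0
    for k in range(steps):
        big_omega = (k + 0.5) * h
        total += imag_part(big_omega) * big_omega / (big_omega**2 - omega**2)
    return 2.0 / math.pi * total * h

estimate = kk_real_part(chi_imag, 500, 1.0e-3, 100_000)  # w = 0.5, cutoff at W = 100
print("Kramers-Kronig estimate:", estimate)
print("direct real part:       ", chi(0.5).real)
```

The finite upper cutoff is acceptable here because the integrand of this particular model drops as $\Omega^{-4}$ at high frequencies; for slower-decaying data, the cutoff error would have to be estimated separately.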


Though the Kramers-Kronig relations are "global" in frequency, in certain cases they allow an approximate calculation of dispersion from experimental data for absorption, collected even in a limited frequency range. For example, if a medium has a sharp absorption peak at some frequency $\omega_j$, we may approximate it as

$$\varepsilon''(\omega) \approx c\,\delta(\omega - \omega_j) + \text{a more smooth function of } \omega, \tag{7.55}$$

and the first of Eqs. (54) immediately gives


19 Strictly speaking, this also requires $|f(\Omega)|$ to decrease faster than $\Omega^{-1}$ at the real axis (at $\Omega'' = 0$), but due to the nonvanishing inertia of charged particles, this requirement is fulfilled for all realistic models of dispersion - see, e.g., Eq. (36).

20 I am typesetting this symbol in a Roman font, to exclude any possibility of its confusion with media’s 
polarization. 
Dispersion near an absorption line

$$\varepsilon'(\omega) \approx \varepsilon_0 + \frac{2c}{\pi}\,\frac{\omega_j}{\omega_j^2 - \omega^2} + \text{another smooth function of } \omega, \tag{7.56}$$

thus predicting the anomalous dispersion near such a point. This calculation shows that such behavior observed in the Lorentz oscillator model (Fig. 5) is by no means accidental or model-specific.

Let me emphasize again that the general, and hence very powerful Kramers-Kronig relations hinge on the causal, linear relation (21) between the polarization $P(t)$ and the electric field $E(t')$, but not on much else. This is why such relations are also valid for similar causal relations in other fields of physics. 21


7.4. Reflection

The most important new effect arising in nonuniform media is wave reflection. Let us start its discussion from the simplest case of a plane electromagnetic wave that is normally incident on an interface between two uniform, linear, isotropic media.

If the interface is an ideal mirror, the description of reflection is very simple. Indeed, let us assume that one of the two media (say, located at $z > 0$, see Fig. 8) cannot sustain any electric field at all:

$$E\big|_{z>0} = 0. \tag{7.57}$$

This condition is evidently incompatible with the single traveling wave (5). However, this solution may be readily corrected using the fact that the dispersion-free 1D wave equation,

$$\left(\frac{\partial^2}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)E = 0, \tag{7.58}$$

supports waves propagating, with the same speed, in opposite directions. As a result, the following linear superposition of two such waves,

$$E\big|_{z\leq0} = f(z - vt) - f(-z - vt), \tag{7.59}$$


21 In this context, it is important to remember that a simple-looking relation between Fourier amplitudes of certain variables, such as $D_\omega = \varepsilon(\omega)E_\omega$, still does not imply a causal relationship between them. This means that the Kramers-Kronig relations are not necessarily valid for either functions $\varepsilon(\omega)$ and $\mu(\omega)$, or their reciprocals, of an arbitrary medium. Indeed, since any Green's function describing a causal relationship has to tend to zero at small times $\theta = t - t'$ (because no system may respond to an external force instantly), its Fourier image has to tend to zero at $\omega \to \pm\infty$. This is certainly true, for example, for function $f(\omega) \equiv \varepsilon(\omega) - \varepsilon_0$ given by Eq. (32) describing a dilute electric medium, but not for its inverse $1/f(\omega) \propto (\omega_0^2 - \omega^2) - 2i\delta\omega$, which diverges at large frequencies. As another example, since in a dilute linear medium the magnetic response should be due to a causal relation between the average magnetic field $B$ (cause) and magnetization $M$ (effect), whose Fourier images are related as $M_\omega = \chi_m(\omega)H_\omega = [1/\mu_0 - 1/\mu(\omega)]B_\omega$, the Kramers-Kronig relations may be expected to be valid for function $f(\omega) \equiv 1/\mu_0 - 1/\mu(\omega)$, but not for $\mu(\omega)$ or even $[\mu(\omega) - \mu_0]$. Unfortunately, magnetic susceptibility dispersion studies were started just recently, mostly in the context of the negative refractivity effects - see Sec. 5 below, and I am not aware of any convincing discussion of this issue even in research literature (let alone textbooks :-).


Dispersion 
near an 
absorption 
line 




Wave’s 

total 

reflection 


satisfies both the equation and the boundary condition (57), for an arbitrary function $f$. The second term in Eq. (59) may be interpreted as the total reflection of the incident wave described by its first term, in this case with the change of the electric field's sign. By the way, since vector $\mathbf{n}$ of the reflected wave is opposite to that of the incident one (see arrows in Fig. 1), Eq. (6) shows that the magnetic field of the wave does not change its sign at the reflection:

$$H\big|_{z\leq0} = \frac{1}{Z}\left[f(z - vt) + f(-z - vt)\right]. \tag{7.60}$$



Fig. 7.8. Spatial dependence of electric field at the reflection of a sinusoidal wave from a perfect conductor: the real pattern (red lines) and the crude, ideal-mirror approximation (blue lines). Dashed lines show the patterns after a half-period time delay ($\omega\Delta t = \pi$).


Blue lines in Fig. 8 show the resulting pattern (59) for the simplest, sinusoidal waveform

$$E\big|_{z\leq0} = \mathrm{Re}\left[E_\omega e^{i(kz - \omega t)} - E_\omega e^{i(-kz - \omega t)}\right]. \tag{7.61a}$$

Depending on convenience in a particular context, this pattern may be legitimately interpreted either as a superposition (61a) of two traveling waves or a single standing wave,

$$E\big|_{z\leq0} = -2\,\mathrm{Im}\left(E_\omega e^{-i\omega t}\right)\sin kz = 2\,\mathrm{Re}\left(iE_\omega e^{-i\omega t}\right)\sin kz, \tag{7.61b}$$

in which the electric and magnetic fields oscillate with phase shifts of $\pi/2$ both in time and space:

$$H\big|_{z\leq0} = \mathrm{Re}\left[\frac{E_\omega}{Z}e^{i(kz - \omega t)} + \frac{E_\omega}{Z}e^{i(-kz - \omega t)}\right] = 2\,\mathrm{Re}\left(\frac{E_\omega}{Z}e^{-i\omega t}\right)\cos kz. \tag{7.62}$$

As the result of this shift, the time average of the Poynting vector's magnitude,

$$S(z,t) = EH = \frac{1}{Z}\,\mathrm{Re}\left[iE_\omega^2 e^{-2i\omega t}\right]\sin 2kz, \tag{7.63}$$

equals zero, showing that at the total reflection there is no average power flow. (This is natural, because the perfect mirror can neither transmit the wave nor absorb it.) However, Eq. (63) shows that the standing wave provides local oscillations of energy, transferring it periodically between the concentrations of the electric and magnetic fields, separated by distance $\Delta z = \pi/2k = \lambda/4$.

For the case of sinusoidal waves, the reflection effects may be readily explored even for the more general case of dispersive and/or lossy media in which $\varepsilon(\omega)$ and $\mu(\omega)$, and hence the wave vector $k(\omega)$ and wave impedance $Z(\omega)$, defined by Eqs. (28), are certain complex functions of frequency. The "only" new factors we have to account for are that in this case the reflection may not be full, and that inside the second medium we have to use the traveling-wave solution as well. Both these factors may be taken care of by looking for the solution of our boundary problem in the form

Wave's partial reflection

$$E\big|_{z\leq0} = \mathrm{Re}\left[E_\omega\left(e^{ik_-z} + Re^{-ik_-z}\right)e^{-i\omega t}\right], \qquad E\big|_{z\geq0} = \mathrm{Re}\left[E_\omega Te^{ik_+z}e^{-i\omega t}\right], \tag{7.64}$$

and hence, according to Eq. (6),

$$H\big|_{z\leq0} = \mathrm{Re}\left[\frac{E_\omega}{Z_-(\omega)}\left(e^{ik_-z} - Re^{-ik_-z}\right)e^{-i\omega t}\right], \qquad H\big|_{z\geq0} = \mathrm{Re}\left[\frac{E_\omega}{Z_+(\omega)}Te^{ik_+z}e^{-i\omega t}\right]. \tag{7.65}$$


(Indices + and - correspond to, respectively, the media at z > 0 and z < 0.) Please note the following 
important features of these relations: 

(i) Due to the problem linearity, we could (and did :-) take the complex amplitudes of the reflected and transmitted waves proportional to that ($E_\omega$) of the incident wave, describing them by the dimensionless coefficients $R$ and $T$. The total reflection from an ideal mirror, that was discussed above, corresponds to the particular case $R = -1$ and $T = 0$.

(ii) Since the incident wave, that we are considering, arrives from one side only (from $z = -\infty$), there is no need to include a term proportional to $\exp\{-ik_+z\}$ into Eqs. (64)-(65) - in our current problem.
However, we would need such a term if the medium at z > 0 was non-uniform (e.g., had at least one 
more interface or any other inhomogeneity), because the wave reflected from that additional 
inhomogeneity would be incident on our interface (located at z = 0) from the right. 

(iii) Solution (64)-(65) is sufficient even for the description of the cases when waves cannot 
propagate at z > 0, for example a conductor or a plasma with co p > co. Indeed, the exponential drop of the 
field amplitude at z > 0 in such cases is automatically described by the imaginary part of wave number 
k+ - see Eq. (29). 

In order to find coefficients $R$ and $T$, we need to use boundary conditions at $z = 0$. Since the reflection does not change the transverse character of the partial waves, at the normal incidence both vectors $\mathbf{E}$ and $\mathbf{H}$ remain tangential to the interface plane (in our notation, $z = 0$). Reviewing the arguments that have led us, in statics, to boundary conditions (3.47) and (5.118) for these components, we see that they remain valid for the time-dependent situation as well, 22 so that for our current case of purely transverse waves we can write:

$$E\big|_{z=-0} = E\big|_{z=+0}, \qquad H\big|_{z=-0} = H\big|_{z=+0}. \tag{7.66}$$

Plugging Eqs. (64)-(65) into these conditions, we get

$$1 + R = T, \qquad \frac{1}{Z_-}(1 - R) = \frac{1}{Z_+}T. \tag{7.67}$$


22 For example, the first of conditions (66) may be obtained by integrating the full (time-dependent) Maxwell equation $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ over a narrow and long rectangular contour with dimensions $l$ and $d$ ($d \ll l$) stretched along the interface. In the Stokes theorem, the first term gives $\Delta E_\tau l$, while the contribution of the second term is proportional to the product $dl$ and vanishes as $d/l \to 0$. The proof of the second boundary condition is similar - as was already discussed in Sec. 6.2.
Solving this simple system of equations, we get 23

Reflection and transmission at a sharp interface

$$R = \frac{Z_+ - Z_-}{Z_+ + Z_-}, \qquad T = \frac{2Z_+}{Z_+ + Z_-}. \tag{7.68}$$

These formulas are very important, and much more general than one may think, because they are applicable for virtually any 1D waves - electromagnetic or not, if only the impedance $Z$ is defined in a proper way. 24 Since in the general case the wave impedances $Z_\pm$, defined by Eq. (28) with the corresponding indices, are complex functions of frequency, Eqs. (68) show that coefficients $R$ and $T$ may have imaginary parts as well. This fact has most important consequences at $z < 0$ where the reflected wave, proportional to $R$, interferes with the incident wave. Indeed, plugging $R = |R|e^{i\varphi}$ (where $\varphi \equiv \arg R$ is a real phase shift) into the expression in parentheses in the first of Eqs. (64), we may rewrite it as

$$\left(e^{ik_-z} + Re^{-ik_-z}\right) = \left(1 - |R| + |R|\right)e^{ik_-z} + |R|e^{i\varphi}e^{-ik_-z} = \left(1 - |R|\right)e^{ik_-z} + 2|R|e^{i\varphi/2}\sin\left[k_-(z - \delta_-)\right], \quad \text{where } \delta_- \equiv \frac{\varphi - \pi}{2k_-}. \tag{7.69}$$

This means that the field may be represented as a sum of a traveling wave and a standing wave, with amplitude proportional to $|R|$, shifted by distance $\delta_-$ toward the interface, relative to the ideal-mirror pattern (61b). This effect is frequently used for the experimental measurement of an unknown impedance $Z_+$ of some medium, provided that $Z_-$ is known (e.g., for the free space, $Z_- = Z_0$). For that, a small antenna (the probe), not disturbing the field distribution too much, is placed into the wave field, and the amplitude of the ac voltage induced in it by the wave is measured by some detector (e.g., a semiconductor diode with a quadratic I-V curve), as a function of $z$ (Fig. 9). From this measurement, it is straightforward to find both $|R|$ and $\delta_-$, and hence restore complex $R$, and then use Eq. (68) to calculate both the modulus and argument of $Z_+$. 25



V ccE\z,t) 


Fig. 7.9. Measurement of the complex 
impedance of a medium (schematically). 


Now let us discuss what these results give for waves incident from the free space ($Z_-(\omega) = Z_0$ = const, $k_- = k_0 = \omega/c$) onto the surface of two particular media.


23 Please note that only the media impedances (rather than wave velocities) are important for the reflection in this case! Unfortunately, this fact is not clearly emphasized in some textbooks that discuss only the case $\mu_\pm = \mu_0$, when $Z = (\mu_0/\varepsilon)^{1/2}$ and $v = 1/(\mu_0\varepsilon)^{1/2}$ are proportional to each other.

24 See, e.g., the discussion of elastic waves of mechanical deformations in CM Secs. 5.3, 5.4, 7.7, and 7.8.

25 Before the advent of computers, specially lined paper (called the Smith chart) was commercially available for performing this recalculation graphically; it is occasionally used even nowadays for result presentation.

(i) For a collision-free plasma (with negligible magnetization) we may use Eq. (36) with $\mu(\omega) = \mu_0$, to present the impedance in either of two equivalent forms:

$$Z_+ = Z_0\frac{\omega}{(\omega^2 - \omega_p^2)^{1/2}} = -iZ_0\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}}. \qquad (7.70)$$


The former expression is more convenient in the case $\omega > \omega_p$, when the wave vector $k_+$ and the wave impedance $Z_+$ of the plasma are real, so that a part of the incident wave propagates into the plasma. Plugging this expression into the latter of Eqs. (68), we see that the transmission coefficient is real:

$$T = \frac{2\omega}{\omega + (\omega^2 - \omega_p^2)^{1/2}}. \qquad (7.71)$$


Note that according to this formula, somewhat counter-intuitively, $T > 1$ for any frequency (above $\omega_p$). How can the transmitted wave be more intense than the incident one that has induced it? For a better understanding of this result, let us compare the powers (rather than amplitudes) of these two waves, i.e. their average Poynting vectors (42):

$$\bar{S}_{\rm incident} = \frac{|E_\omega|^2}{2Z_0}, \qquad \bar{S}_+ = \frac{|TE_\omega|^2}{2Z_+} = \frac{|E_\omega|^2}{2Z_0}\,\frac{4\omega(\omega^2 - \omega_p^2)^{1/2}}{\left[\omega + (\omega^2 - \omega_p^2)^{1/2}\right]^2}. \qquad (7.72)$$


It is easy to see that the ratio of these two values 26 is always below 1 (and tends to zero at $\omega \to \omega_p$), so that only a fraction of the incident wave power may be transferred. Hence the result $T > 1$ may be interpreted as follows: the interface between two media also works as an impedance transformer. It can never transfer more power than the incident wave provides, i.e. it can only decrease the product $S = EH$; but since the ratio $Z = E/H$ changes at the interface, the amplitude of one of the fields may increase at the transfer.
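As a quick numerical check of this interpretation, here is a minimal Python sketch that evaluates $T$ from Eq. (71) and the power ratio from Eq. (72) for several values of $\omega/\omega_p$. The function name and the sample frequency ratios are illustrative assumptions, not part of the original analysis.

```python
import math

def plasma_transmission(w, wp):
    """Amplitude coefficient T, Eq. (71), and the power ratio of Eq. (72), for w > wp."""
    root = math.sqrt(w**2 - wp**2)
    t = 2 * w / (w + root)                        # transmitted/incident field amplitude
    power_ratio = 4 * w * root / (w + root) ** 2  # transmitted/incident average power
    return t, power_ratio

for x in (1.001, 1.1, 2.0, 10.0):                 # x = w / wp
    t, p = plasma_transmission(x, 1.0)
    print(f"w/wp = {x:6.3f}   T = {t:.4f}   S+/S_incident = {p:.4f}")
```

The amplitude coefficient stays above 1 at all these frequencies, while the power ratio stays below 1 and equals $1 - R^2$, as energy conservation requires.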


Now let us proceed to the case $\omega < \omega_p$, when the waves cannot propagate in the plasma. In this case, the latter of expressions (70) is more convenient, because it immediately shows that $Z_+$ is purely imaginary, while $Z_- = Z_0$ is purely real. This means that $(Z_+ - Z_-) = -(Z_+ + Z_-)^*$, i.e. according to the first of Eqs. (68), $|R| = 1$, so that the reflection is total, i.e. no incident power (on the average) is transferred into the plasma, as was already discussed in Sec. 2. However, the complex $R$ has a finite argument,

$$\varphi \equiv \arg R = \pi + 2\arg(Z_+ - Z_0) = \pi + 2\arctan\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}} \pmod{2\pi}, \qquad (7.73)$$

and hence provides a finite spatial shift (69) of the standing wave toward the plasma surface:

$$\delta_- = \frac{\varphi - \pi}{2k_0} = \frac{c}{\omega}\arctan\frac{\omega}{(\omega_p^2 - \omega^2)^{1/2}}. \qquad (7.74)$$


On the other hand, we already know from Eq. (40) that the solution at $z > 0$ is exponential, with the decay length $\delta$ that is described by Eq. (39). Calculating, from coefficient $T$, the exact coefficient before this exponent, it is straightforward to verify that the electric and magnetic fields are indeed continuous at the interface, forming the pattern shown by red lines in Fig. 8. This penetration may be experimentally observed, for example, by bringing close to the interface the surface of another material transparent at frequency $\omega$. Even without solving this problem exactly, it is evident that if the distance between these two interfaces becomes comparable to $\delta$, a part of the exponential "tail" of the field is picked up by the second material, and induces a propagating wave. This is an electromagnetic analog of the quantum-mechanical tunneling through a potential barrier. 27

26 This ratio is sometimes also called the transmission coefficient, but in order to avoid confusing it with $T$, it is better to call it the power transmission coefficient.


Note that at $\omega \ll \omega_p$, both $\delta_-$ and $\delta$ are reduced to the same frequency-independent value,

$$\delta, \delta_- \to \frac{c}{\omega_p} = \left(\frac{c^2\varepsilon_0 m_e}{ne^2}\right)^{1/2} = \left(\frac{m_e}{\mu_0 ne^2}\right)^{1/2}, \qquad (7.75)$$


which is just the field penetration depth $\delta$ (6.38) calculated for a perfect conductor model (assuming $m = m_e$ and $\mu = \mu_0$) in the quasistatic limit. This is natural, because the condition $\omega \ll \omega_p$ may be recast as $\lambda_0 \equiv 2\pi c/\omega \gg 2\pi c/\omega_p \equiv 2\pi\delta$.
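A short numerical sketch may help to see both statements at once: at $\omega < \omega_p$ the purely imaginary $Z_+$ gives $|R| = 1$, and the standing-wave shift $\delta_- = (\varphi - \pi)/2k_0$ approaches $c/\omega_p$ at low frequencies. The Python fragment below assumes $Z_+/Z_0 = -i\omega/(\omega_p^2 - \omega^2)^{1/2}$, i.e. the field convention $\exp\{i(kz - \omega t)\}$ with the solution decaying at $z > 0$; the function name and the plasma frequency value are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s

def reflection_below_cutoff(w, wp):
    """Return (|R|, delta_minus) for a plasma with wp > w."""
    a = w / math.sqrt(wp**2 - w**2)
    z_ratio = -1j * a                      # assumed Z_+/Z_0 (purely imaginary)
    r = (z_ratio - 1) / (z_ratio + 1)      # R = (Z_+ - Z_0)/(Z_+ + Z_0)
    phi = cmath.phase(r) % (2 * math.pi)   # arg R reduced to [0, 2*pi)
    delta_minus = (phi - math.pi) * C / (2 * w)   # shift (phi - pi)/(2 k_0), k_0 = w/c
    return abs(r), delta_minus

wp = 1.0e16                                # hypothetical plasma frequency, 1/s
for x in (0.9, 0.5, 0.1, 0.001):           # x = w / wp
    r_mod, d = reflection_below_cutoff(x * wp, wp)
    print(f"w/wp = {x:5.3f}  |R| = {r_mod:.6f}  delta_minus * wp / c = {d * wp / C:.6f}")
```

The last column tends to 1 as $\omega/\omega_p$ decreases, i.e. the shift of the standing wave tends to the same length $c/\omega_p$ as the field penetration depth.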

(ii) Now let us consider electromagnetic wave reflection from a nonmagnetic conductor. In the simplest low-frequency limit, when $\omega\tau$ is much less than 1, the conductor may be described by a frequency-independent conductivity $\sigma$. 28 According to Eq. (46), in this case we can take

$$Z_+ = \left(\frac{\mu_0}{\varepsilon_{\rm opt}(\omega) + i\sigma/\omega}\right)^{1/2}. \qquad (7.76)$$


With this substitution, Eqs. (68) immediately give us all the results of interest. In particular, they show that now $R$ is complex, and hence some fraction $F$ of the incident wave is absorbed by the conductor. Using Eq. (42), we may calculate the fraction to be

$$F \equiv \frac{\bar{S}_+\big|_{z=+0}}{\bar{S}_{\rm incident}} = |T|^2\,{\rm Re}\frac{Z_0}{Z_+}. \qquad (7.77)$$


(Since the power flow $\bar{S}_+$ into the conductor depends on $z$, tending to zero at distances $z \sim \delta_s$, it is important to calculate it directly at the interface to account for the absorption in the whole volume of the conductor.) Restricting ourselves, for the sake of simplicity, to the most important quasistatic limit, i.e. to $Z_+ = (\mu_0\omega/i\sigma)^{1/2}$, and using Eq. (6.27) to express the impedance via the skin depth, $Z_+ = \pi(2/i)^{1/2}(\delta_s/\lambda_0)Z_0$, we see that $|Z_+| \ll Z_0$, so that, according to Eq. (68), $T \approx 2Z_+/Z_0$ and

$$F \approx \frac{4\,{\rm Re}\,Z_+}{Z_0} = 4\pi\frac{\delta_s}{\lambda_0} \ll 1. \qquad (7.78)$$

Thus the absorbed power scales as the ratio of the skin depth to the free-space wavelength. This important result is widely used for the semi-qualitative evaluation of power losses in metallic waveguides and resonators, and immediately shows that in order to keep the losses low, the characteristic size of such systems (that gives a scale of the free-space wavelengths $\lambda_0$ at which they are used) should be much larger than $\delta_s$. A more detailed theory of these structures will be discussed later in this chapter.

27 See, e.g., QM Sec. 2.3.

28 In a typical metal, $\tau \sim 10^{-13}$ s, so that this approximation works well all the way up to $\omega \sim 10^{13}$ s$^{-1}$, i.e. up to the far-infrared frequencies.
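The estimate of the absorbed fraction can be compared with a direct calculation. The Python sketch below is an illustration under explicit assumptions: it takes the quasistatic impedance $Z_+ = (\mu_0\omega/i\sigma)^{1/2}$, finds the "exact" absorbed fraction as $1 - |R|^2$ (power that is not reflected), and compares it with the estimate $4\pi\delta_s/\lambda_0$ that follows from $T \approx 2Z_+/Z_0$, using the skin depth in the form $\delta_s = (2/\mu_0\sigma\omega)^{1/2}$. The function name, the conductivity, and the 10-GHz frequency are hypothetical sample choices.

```python
import cmath
import math

MU0 = 4e-7 * math.pi        # vacuum permeability (conventional value), H/m
C = 299_792_458.0           # speed of light, m/s
Z0 = MU0 * C                # free-space wave impedance, ohms

def absorbed_fraction(sigma, w):
    """Return (exact, estimate) of the absorbed power fraction F at normal incidence."""
    z_plus = cmath.sqrt(MU0 * w / (1j * sigma))   # quasistatic Z_+ of the conductor
    r = (z_plus - Z0) / (z_plus + Z0)             # R = (Z_+ - Z_0)/(Z_+ + Z_0)
    exact = 1.0 - abs(r) ** 2                     # power that is not reflected
    skin_depth = math.sqrt(2.0 / (MU0 * sigma * w))
    wavelength = 2.0 * math.pi * C / w            # free-space wavelength
    return exact, 4.0 * math.pi * skin_depth / wavelength

sigma = 6.0e7               # S/m, of the order of the best conductors
w = 2 * math.pi * 10e9      # hypothetical 10 GHz
exact, estimate = absorbed_fraction(sigma, w)
print(f"F (exact) = {exact:.6e},  4*pi*delta_s/lambda_0 = {estimate:.6e}")
```

The two numbers differ only by a relative correction of the order of $|Z_+|/Z_0$, which is the small parameter of the approximation.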


7.5. Refraction 

Now let us consider the effects arising at a plane interface if the wave incidence angle $\theta$ (Fig. 10) is arbitrary, rather than equal to zero as in our previous analysis, for the simplest case of fully transparent media, with real $\varepsilon_\pm$ and $\mu_\pm$.



Fig. 7.10. Plane wave reflection, transmission, and refraction at a plane interface. The plane of drawing is selected to contain all three wave vectors $\mathbf{k}_+$, $\mathbf{k}_-$, and $\mathbf{k}_-'$.


In contrast with the case of normal incidence, here the wave vectors $\mathbf{k}_-$, $\mathbf{k}_-'$, and $\mathbf{k}_+$ of the three component (incident, reflected, and transmitted) waves may have different directions. Hence now we have to start our analysis with writing a general expression for a single plane, monochromatic wave for the case when its wave vector $\mathbf{k}$ has all 3 Cartesian components, rather than one. An evident generalization of Eq. (11) to this case is

$$f(\mathbf{r}, t) = {\rm Re}\left[f_\omega e^{i(k_x x + k_y y + k_z z - \omega t)}\right] = {\rm Re}\left[f_\omega e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\right]. \qquad (7.79)$$


This relation enables a ready analysis of "kinematic" relations that are independent of the media impedances. Indeed, it is sufficient to notice that in order to satisfy any linear, homogeneous boundary conditions at the interface ($z = 0$), all waves must have the same temporal and spatial dependence on this plane. Hence if we select plane $xz$ so that vector $\mathbf{k}_-$ lies in it, then $(k_-)_y = 0$, and $\mathbf{k}_+$ and $\mathbf{k}_-'$ cannot have any $y$-component either, i.e. all three vectors lie in the same plane, which is selected as the plane of drawing of Fig. 10. Moreover, for the same reason their $x$-components should be equal:

$$k_-\sin\theta = k_-\sin\theta' = k_+\sin r. \qquad (7.80)$$


From here we immediately have the well-known laws of reflection

$$\theta' = \theta, \qquad (7.81)$$

and refraction: 29

$$\frac{\sin r}{\sin\theta} = \frac{k_-}{k_+}. \qquad (7.82)$$

29 This relation is traditionally called the Snell law, after the 17th-century author W. Snellius, though it has been traced back to a circa 984 manuscript by Abu Saad al-Ala ibn Sahl.
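A minimal numerical sketch of the refraction law (82) is given below. It is written in terms of the two wave numbers only, so that it does not depend on the nature of the waves; the function name and the sample ratio 1.33 are illustrative assumptions. When $(k_-/k_+)\sin\theta$ exceeds 1, no real refraction angle satisfies Eq. (82), and the function reports that by returning None.

```python
import math

def refraction_angle(theta, k_minus, k_plus):
    """Refraction angle r (radians) from sin r / sin theta = k_- / k_+, or None."""
    s = (k_minus / k_plus) * math.sin(theta)
    if abs(s) > 1.0:
        return None                  # no real refraction angle exists
    return math.asin(s)

# Hypothetical pair of media with the wave number ratio 1.33
for deg in (10, 30, 60):
    into_denser = refraction_angle(math.radians(deg), 1.0, 1.33)
    into_rarer = refraction_angle(math.radians(deg), 1.33, 1.0)
    print(deg,
          round(math.degrees(into_denser), 2),
          None if into_rarer is None else round(math.degrees(into_rarer), 2))
```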


In this form, the laws are valid for plane waves of any nature. In optics, the Snell law (82) is frequently presented in the form

$$\frac{\sin r}{\sin\theta} = \frac{n_-}{n_+}, \qquad (7.83)$$

where $n_\pm$ is the index of refraction (also called the "refractive index") of the corresponding medium, defined as its wave number normalized to that of free space (at the wave's frequency):

$$n_\pm \equiv \frac{k_\pm}{k_0} = \left(\frac{\varepsilon_\pm\mu_\pm}{\varepsilon_0\mu_0}\right)^{1/2}. \qquad (7.84)$$

Perhaps the most famous corollary of the Snell law is that if a wave propagates from a medium with a higher index of refraction to that with a lower one (i.e. if $n_- > n_+$ in Fig. 10), for example from water into air, there is always a certain critical value $\theta_c$ of the incidence angle,

$$\theta_c = \arcsin\frac{n_+}{n_-}, \qquad (7.85)$$

at which angle $r$ reaches $\pi/2$. At a larger $\theta$, i.e. within the range $\theta_c < \theta < \pi/2$, the boundary conditions cannot be satisfied with a refracted wave with a real wave vector, so that the wave experiences the so-called total internal reflection. This effect is very important for practice, because it shows that dielectric surfaces may be used as mirrors, in particular in optical fibers, to be discussed in more detail in Sec. 8 below. This is very fortunate for all the telecommunication technology, because the light reflection from metals is rather imperfect. Indeed, according to Eq. (78), in the optical range ($\lambda_0 \sim 0.5$ μm, i.e. $\omega \sim 10^{15}$ s$^{-1}$), even the best conductors (with $\sigma \sim 6\times 10^7$ S/m and hence the normal skin depth $\delta_s \sim 1.5$ nm) provide relatively high losses $F \sim 1\%$ at each reflection.

Note, however, that even within the range $\theta_c < \theta < \pi/2$ the field at $z > 0$ is not identically equal to zero: just as it does at the normal incidence ($\theta = 0$), it penetrates into the less dense medium by a distance of the order of $\lambda_0$, exponentially decaying inside it. At $\theta \neq 0$ the penetrating field still changes sinusoidally, with wave number (80), along the interface. Such a field, exponentially dropping in one direction but still propagating as a wave in another direction, is frequently called the evanescent wave.

One more remark: just as at the normal incidence, the field penetration into another medium causes a phase shift of the reflected wave; see, e.g., Eq. (69) and its discussion. A new feature of this phase shift, arising at $\theta \neq 0$, is that it also has a component parallel to the interface, the so-called Goos-Hänchen effect. In geometric optics, this effect leads to an image shift (relative to its position in a perfect mirror) with components both normal and parallel to the interface.

Now let us carry out an analysis of the "dynamic" relations that determine the amplitudes of the refracted and reflected waves. For this we need to write explicitly the boundary conditions at the interface (i.e. plane $z = 0$). Since now the electric and/or magnetic fields may have components normal to the plane, in addition to the continuity of their tangential components, which we have repeatedly discussed,

$$E_{x,y}\big|_{z=-0} = E_{x,y}\big|_{z=+0}, \qquad H_{x,y}\big|_{z=-0} = H_{x,y}\big|_{z=+0}, \qquad (7.86)$$

we also need relations for the normal components. As it follows from the homogeneous macroscopic Maxwell equations (6.94b), they are also the same as in statics ($D_n = \text{const}$, $B_n = \text{const}$), for our reference frame choice (Fig. 10) giving

$$\varepsilon_- E_z\big|_{z=-0} = \varepsilon_+ E_z\big|_{z=+0}, \qquad \mu_- H_z\big|_{z=-0} = \mu_+ H_z\big|_{z=+0}. \qquad (7.87)$$


The expressions of these components via the amplitudes $E_\omega$, $RE_\omega$, and $TE_\omega$ of the incident, reflected, and transmitted waves depend on the incident wave's polarization. For example, for a linearly-polarized wave with the electric field vector perpendicular to the plane of incidence (Fig. 11a), i.e. parallel to the interface plane, the reflected and refracted waves are similarly polarized.



Fig. 7.11. Reflection and refraction at two different linear polarizations of the incident wave. 


As a result, all $E_z$ are equal to zero (so that the first of Eqs. (87) is inconsequential), while the tangential components of the electric field are just equal to their full amplitudes, just as at the normal incidence, so we still can use Eqs. (64) to express these components via coefficients $R$ and $T$. However, at $\theta \neq 0$ the magnetic fields have not only tangential components

$$H_x\big|_{z=-0} = {\rm Re}\left[\frac{E_\omega}{Z_-}(1 - R)\cos\theta\, e^{-i\omega t}\right], \qquad H_x\big|_{z=+0} = {\rm Re}\left[\frac{E_\omega}{Z_+}T\cos r\, e^{-i\omega t}\right], \qquad (7.88)$$

but also normal components (Fig. 11a):

$$H_z\big|_{z=-0} = {\rm Re}\left[\frac{E_\omega}{Z_-}(1 + R)\sin\theta\, e^{-i\omega t}\right], \qquad H_z\big|_{z=+0} = {\rm Re}\left[\frac{E_\omega}{Z_+}T\sin r\, e^{-i\omega t}\right]. \qquad (7.89)$$

Plugging these expressions into the boundary conditions expressed by Eqs. (86) (in this case, for $y$ components only) and the second of Eqs. (87), we get three equations for two unknown coefficients $R$ and $T$. However, two of these equations duplicate each other because of the Snell law, and we get just two independent equations,

$$1 + R = T, \qquad \frac{1}{Z_-}(1 - R)\cos\theta = \frac{1}{Z_+}T\cos r, \qquad (7.90)$$

which are a very natural generalization of Eqs. (67), with replacements $Z_- \to Z_-\cos r$, $Z_+ \to Z_+\cos\theta$. As a result, we can immediately use Eq. (68) to write the solution of system (90): 30

$$R = \frac{Z_+\cos\theta - Z_-\cos r}{Z_+\cos\theta + Z_-\cos r}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\cos\theta + Z_-\cos r}. \qquad (7.91a)$$


If we want to express the coefficients via the angle of incidence alone, we should use the Snell law (82) to eliminate angle $r$, getting

$$R = \frac{Z_+\cos\theta - Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}{Z_+\cos\theta + Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\cos\theta + Z_-\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2}}. \qquad (7.91b)$$

However, my strong preference is to use the kinematic relation (82) and dynamic relations (91a) separately, because Eq. (91b) obscures the very important physical fact that the ratio of $k_\pm$, i.e. of the wave velocities of the two media, is only involved in the Snell law (82), while the dynamic relations essentially include only the ratio of wave impedances, just as in the case of normal incidence.


In the opposite case of the linear polarization of the electric field within the plane of incidence (Fig. 11b), it is the magnetic field that does not have a normal component, so it is now the second of Eqs. (87) that does not participate in the solution. However, now the electric fields in two media have not only tangential components,

$$E_x\big|_{z=-0} = {\rm Re}\left[E_\omega(1 + R)\cos\theta\, e^{-i\omega t}\right], \qquad E_x\big|_{z=+0} = {\rm Re}\left[E_\omega T\cos r\, e^{-i\omega t}\right], \qquad (7.92)$$

but also normal components (Fig. 11b):

$$E_z\big|_{z=-0} = E_\omega(-1 + R)\sin\theta, \qquad E_z\big|_{z=+0} = -E_\omega T\sin r. \qquad (7.93)$$

As a result, instead of Eqs. (90), the reflection and transmission coefficients are related as

$$(1 + R)\cos\theta = T\cos r, \qquad \frac{1}{Z_-}(1 - R) = \frac{1}{Z_+}T. \qquad (7.94)$$


Again, the solution of this system may be immediately written using the analogy with Eq. (67):

$$R = \frac{Z_+\cos r - Z_-\cos\theta}{Z_+\cos r + Z_-\cos\theta}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\cos r + Z_-\cos\theta}, \qquad (7.95a)$$

or, alternatively, using the Snell law:

$$R = \frac{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} - Z_-\cos\theta}{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} + Z_-\cos\theta}, \qquad T = \frac{2Z_+\cos\theta}{Z_+\left[1 - (k_-/k_+)^2\sin^2\theta\right]^{1/2} + Z_-\cos\theta}. \qquad (7.95b)$$
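The two pairs of formulas are easy to evaluate numerically. The following Python sketch is only an illustration: it computes $R$ and $T$ for both polarizations, using the forms with $\cos r = [1 - (k_-/k_+)^2\sin^2\theta]^{1/2}$, and then checks the relations $R' = -R$ and $R^2 + TT' = 1$ for the wave traveling in the opposite direction. The function name, the choice of the complex square root branch (positive imaginary $\cos r$ beyond the critical angle, corresponding to a decaying field), and the sample media with $n_+/n_- = 1.5$ are assumptions made for the example.

```python
import cmath
import math

def fresnel(theta, z_minus, z_plus, k_ratio, e_in_plane):
    """Return (R, T) for the incidence angle theta (radians).

    k_ratio = k_-/k_+; e_in_plane=True means E within the plane of incidence,
    e_in_plane=False means E normal to it.
    """
    cos_t = math.cos(theta)
    # cos r from the Snell law; it becomes imaginary beyond the critical angle
    cos_r = cmath.sqrt(1.0 - (k_ratio * math.sin(theta)) ** 2)
    if e_in_plane:
        den = z_plus * cos_r + z_minus * cos_t
        return (z_plus * cos_r - z_minus * cos_t) / den, 2 * z_plus * cos_t / den
    den = z_plus * cos_t + z_minus * cos_r
    return (z_plus * cos_t - z_minus * cos_r) / den, 2 * z_plus * cos_t / den

# Hypothetical nonmagnetic interface with n_+/n_- = 1.5, so Z_+/Z_- = k_-/k_+ = 1/1.5
theta = math.radians(40.0)
r_angle = math.asin(math.sin(theta) / 1.5)              # refraction angle
for pol in (False, True):
    r, t = fresnel(theta, 1.0, 1 / 1.5, 1 / 1.5, pol)
    r_back, t_back = fresnel(r_angle, 1 / 1.5, 1.0, 1.5, pol)   # reversed wave
    print(pol, r, t, abs(r_back + r), abs(r * r + t * t_back - 1))
```

The last two printed numbers should vanish within the rounding errors. Calling the same function for the reversed wave at an angle above $\arcsin(1/1.5)$ gives $|R| = 1$, i.e. the total internal reflection.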


30 Note that we may calculate the reflection and transmission coefficients $R'$ and $T'$ for the wave traveling in the opposite direction just by making parameter swaps $Z_+ \leftrightarrow Z_-$ and $\theta \leftrightarrow r$, and that the resulting coefficients satisfy the following Stokes relations: $R' = -R$, and $R^2 + TT' = 1$, for any $Z_\pm$.

For the particular case $\mu_+ = \mu_- = \mu_0$, when $Z_+/Z_- = (\varepsilon_-/\varepsilon_+)^{1/2} = k_-/k_+ = n_-/n_+$ (which is approximately correct for traditional optical media), Eqs. (91b) and (95b) are called the Fresnel formulas. 31 Most textbooks are quick to point out that there is a major difference between these cases: for the electric field polarization within the plane of incidence (Fig. 11b), the reflected wave amplitude (proportional to coefficient $R$) turns to zero at a special value of $\theta$ (the so-called Brewster angle): 32

$$\theta_B = \arctan\frac{n_+}{n_-}, \qquad (7.96)$$

while there is no such angle in the opposite case (Fig. 11a). 33 However, note that this statement, as well as Eq. (96), is true only for the case $\mu_+ = \mu_-$. In the general case of different $\varepsilon$ and $\mu$, Eqs. (91) and (95) show that the reflected wave vanishes at $\theta = \theta_B$ with

$$\tan^2\theta_B = \frac{\varepsilon_-\mu_+ - \varepsilon_+\mu_-}{\varepsilon_+\mu_+ - \varepsilon_-\mu_-}\times\begin{cases}(\mu_+/\mu_-), & \text{for } \mathbf{E}\perp\mathbf{n}_z \text{ (Fig. 11a)},\\ (-\varepsilon_+/\varepsilon_-), & \text{for } \mathbf{H}\perp\mathbf{n}_z \text{ (Fig. 11b)}.\end{cases} \qquad (7.97)$$

Note the natural $\varepsilon \leftrightarrow \mu$ symmetry of these relations, resulting from the $\mathbf{E} \leftrightarrow \mathbf{H}$ symmetry for these two polarization cases (Fig. 11). They also show that for any set of parameters of the two media (with $\varepsilon_\pm, \mu_\pm > 0$), $\tan^2\theta_B$ is positive (and hence a real Brewster angle $\theta_B$ exists) only for one of these two polarizations. In particular, if the interface is due to the change of $\mu$ alone (i.e. $\varepsilon_+ = \varepsilon_-$), the first of Eqs. (97) is reduced to the simple form (96) again, while for the polarization shown in Fig. 11b there is no Brewster angle, i.e. the reflected wave has a nonvanishing amplitude for any $\theta$.
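These statements may be checked with a short numerical sketch. The Python fragment below evaluates $\tan^2\theta_B$ in the general form $\frac{\varepsilon_-\mu_+ - \varepsilon_+\mu_-}{\varepsilon_+\mu_+ - \varepsilon_-\mu_-}$ times $\mu_+/\mu_-$ (for E normal to the plane of incidence) or $-\varepsilon_+/\varepsilon_-$ (for E within that plane), and verifies that the reflection coefficient indeed vanishes at the resulting angle. The function names and the two sample interfaces are hypothetical; the media parameters are taken in relative units, and the formula is not applicable when $\varepsilon_+\mu_+ = \varepsilon_-\mu_-$.

```python
import math

def tan2_brewster(eps_m, mu_m, eps_p, mu_p, e_in_plane):
    """tan^2 of the zero-reflection angle; a negative value means no real angle."""
    common = (eps_m * mu_p - eps_p * mu_m) / (eps_p * mu_p - eps_m * mu_m)
    factor = -eps_p / eps_m if e_in_plane else mu_p / mu_m
    return common * factor

def reflection(theta, eps_m, mu_m, eps_p, mu_p, e_in_plane):
    """Real R for transparent media, for angles below any critical angle."""
    z_m, z_p = math.sqrt(mu_m / eps_m), math.sqrt(mu_p / eps_p)
    sin_r = math.sqrt(eps_m * mu_m / (eps_p * mu_p)) * math.sin(theta)  # Snell law
    cos_t, cos_r = math.cos(theta), math.sqrt(1.0 - sin_r ** 2)
    if e_in_plane:
        return (z_p * cos_r - z_m * cos_t) / (z_p * cos_r + z_m * cos_t)
    return (z_p * cos_t - z_m * cos_r) / (z_p * cos_t + z_m * cos_r)

cases = {
    "dielectric step (eps_+ = 2.25 eps_-)": (1.0, 1.0, 2.25, 1.0),
    "magnetic step (mu_+ = 4 mu_-)": (1.0, 1.0, 1.0, 4.0),
}
for name, media in cases.items():
    for pol in (False, True):
        label = "E in plane" if pol else "E normal to plane"
        t2 = tan2_brewster(*media, pol)
        if t2 > 0:
            theta_b = math.atan(math.sqrt(t2))
            print(name, label, math.degrees(theta_b), reflection(theta_b, *media, pol))
        else:
            print(name, label, "no Brewster angle")
```

For each interface, exactly one of the two polarizations yields a real angle, at which the computed $R$ vanishes within the rounding errors.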


Such an account of both media parameters on an equal footing is especially necessary to describe the so-called negative refraction effects. 34 As was shown in Sec. 2, in a medium with electric-field-driven resonances, function $\varepsilon(\omega)$ may be almost real and negative, at least within limited frequency intervals; see, in particular, Eq. (34) and Fig. 5. As has already been discussed, if, at these frequencies, function $\mu(\omega)$ is real and positive, then $k^2(\omega) = \omega^2\varepsilon(\omega)\mu(\omega) < 0$, and $k$ may be presented as $i/\delta$ with real $\delta$, meaning the exponential field decay into the medium. However, let us consider the case when both $\varepsilon(\omega) < 0$ and $\mu(\omega) < 0$ at a certain frequency. (This is evidently possible in a medium with both E-driven and H-driven resonances, at proper relations between their eigenfrequencies.) Since in this case $k^2(\omega) = \omega^2\varepsilon(\omega)\mu(\omega) > 0$, the wave vector is real, so that Eq. (79) describes a traveling wave, and one could think that there is nothing new in this case. Not quite so!


31 After A.-J. Fresnel (1788-1827), one of the pioneers of the wave optics, who is credited, among many other 
contributions (see in particular Ch. 8), for the concept of light as a purely transverse wave. 

32 A very simple interpretation of Eq. (96) is based on the fact that, together with the Snell law (82), it gives $r + \theta = \pi/2$. As a result, vector $\mathbf{E}_+$ is parallel to vector $\mathbf{k}_-'$, and hence the oscillating dipoles of the medium at $z > 0$ do not have the component that could induce the transverse electric field $\mathbf{E}_-'$ of the reflected wave.

33 This effect is used in practice to obtain linearly polarized light, with the electric field vector perpendicular to the plane of incidence, from natural light with its random polarization. An even more practical application of the effect is a partial reduction of undesirable glare from wet surfaces (for the water/air interface, $n_+/n_- \approx 1.33$, giving $\theta_B \approx 53°$) by making car light covers and sunglasses of vertically-polarizing materials.

34 Despite some important background theoretical work by A. Schuster (1904), L. Mandelstam (1945), D. Sivukhin (1957), and especially V. Veselago (1966-67), the negative refractivity effects have only recently become a subject of intensive scientific research and engineering development.

First of all, for a sinusoidal, plane wave (79), operator $\nabla$ is equivalent to the multiplication by $i\mathbf{k}$. As the Maxwell equations (2a) show, this means that at a fixed direction of vectors $\mathbf{E}$ and $\mathbf{k}$, the simultaneous reversal of signs of $\varepsilon$ and $\mu$ means the reversal of the direction of vector $\mathbf{H}$. Namely, if both $\varepsilon$ and $\mu$ are positive, these equations are satisfied with mutually orthogonal vectors $\mathbf{E}$, $\mathbf{H}$, and $\mathbf{k}$ forming the usual, right-hand system (see Fig. 1 and Fig. 12a), the name stemming from the popular "right-hand rule" used to determine the vector product direction. However, if both $\varepsilon$ and $\mu$ are negative, the vectors form a left-hand system; see Fig. 12b. (Due to this fact, the media with $\varepsilon < 0$ and $\mu < 0$ are frequently called the left-handed materials, LHM for short.) According to Eq. (6.97), which does not involve media parameters, this means that for a plane wave in a left-hand material, the Poynting vector $\mathbf{S} = \mathbf{E}\times\mathbf{H}$, i.e. the vector of the energy flow, is directed opposite to the wave vector $\mathbf{k}$.




Fig. 7.12. Directions of main vectors of a plane wave inside a medium with (a) positive and (b) negative $\varepsilon$ and $\mu$.


This fact may seem strange, but is in no contradiction with any fundamental principle. Let me remind you that, according to the definition of vector $\mathbf{k}$, its direction shows the direction of the phase velocity $v_{\rm ph} = \omega/k$ of a sinusoidal (and hence infinitely long) wave that cannot be used, for example, for signaling. Such signaling (by sending wave packets; see Fig. 13) is possible with the group velocity $v_{\rm gr} = d\omega/dk$. This velocity in left-hand materials is always positive (directed along vector $\mathbf{S}$).



Fig. 7.13. Example of a wave packet 
moving along axis z with a negative 
phase velocity, but positive group 
velocity. Blue lines show a packet 
snapshot a short time interval after the 
first snapshot (red lines). 


Maybe the most fascinating effect possible with left-hand materials is the wave refraction at their interfaces with the usual, right-handed materials, first predicted by V. Veselago. Consider the example shown in Fig. 14a. In the incident wave, coming from the usual material, the directions of vectors $\mathbf{k}_-$ and $\mathbf{S}_-$ coincide, and so they do in the reflected wave characterized by vectors $\mathbf{k}_-'$ and $\mathbf{S}_-'$. This means that the electric and magnetic fields in the interface plane ($z = 0$) are, at our choice of coordinates, proportional to $\exp\{ik_x x\}$, with positive component $k_x = k_-\sin\theta$. In order to satisfy any linear boundary conditions, the refracted wave, going into the left-handed material, should match that dependence, i.e. have a positive $x$-component of its wave vector $\mathbf{k}_+$. But in this medium, this vector has to be antiparallel to vector $\mathbf{S}_+$ that, in turn, should be directed out of the interface, because it presents the power flow from the interface into the material bulk. These conditions cannot be reconciled by the refracted wave propagating along the usual Snell-law direction (shown by the dashed line in Fig. 14a), but are all satisfied at refraction in the direction given by Snell's angle with the negative sign. (Hence the term "negative refraction".) 35




Fig. 7.14. Negative refraction: (a) waves at the interface between media with positive and negative values of $\varepsilon$ and $\mu$, and (b) the hypothetical perfect lens: a parallel plate made of a material with $\varepsilon = -\varepsilon_0$ and $\mu = -\mu_0$.


In order to understand how unusual the results of the negative refraction may be, let us consider a parallel slab of thickness $d$, made of a hypothetical left-handed material with $\varepsilon = -\varepsilon_0$, $\mu = -\mu_0$ (Fig. 14b), placed in free space. For such a material, the refraction angle $r = -\theta$, so that the rays from a point source, located at a distance $a < d$ from the slab, propagate as shown in that figure, i.e. all meet again at distance $a$ inside the plate, and then continue to propagate to the second surface of the slab. Repeating our discussion for this surface, we see that a point's image is also formed beyond the plate, at distance $2a + 2b = 2a + 2(d - a) = 2d$ from the object. Superficially, this looks like the usual lens, but the well-known lens formula, which relates $a$ and $b$ with the focal length $f$, is not satisfied. (In particular, a parallel beam is not focused into a point at any finite distance.)
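The ray geometry described above can be traced in a few lines. The Python sketch below is an illustration only: it places the point source at $z = -a$, the slab at $0 < z < d$, assumes the refraction angle $r = -\theta$ at both surfaces, and finds where a ray launched at angle $\theta$ crosses the axis inside and behind the slab. The function name and the sample distances are hypothetical.

```python
import math

def trace_through_slab(theta, a, d):
    """Trace one ray from a point source at z = -a through a slab at 0 < z < d.

    Assumes the refraction angle r = -theta at both surfaces.
    Returns the z-coordinates of the axis crossings inside and behind the slab.
    """
    slope = math.tan(theta)          # dx/dz of the ray in free space
    x_front = a * slope              # ray height at the first surface (z = 0)
    z_inside = x_front / slope       # inside, dx/dz = -slope: x = x_front - slope*z
    x_back = x_front - slope * d     # ray height at the second surface (z = d)
    z_behind = d - x_back / slope    # behind the slab, dx/dz = +slope again
    return z_inside, z_behind

a, d = 0.3, 1.0                      # hypothetical distances, with a < d
for deg in (5, 20, 45, 70):
    z1, z2 = trace_through_slab(math.radians(deg), a, d)
    print(deg, z1, z2 + a)           # z2 + a is the object-to-image distance
```

For every launch angle the first crossing is at depth $a$ inside the slab, and the second one is at distance $2d$ from the source, so that all rays meet at the same two points.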

As an additional difference from the usual lens, the system shown in Fig. 14b does not reflect any part of the incident light. Indeed, it is straightforward to check that in order for all the above formulas for $R$ and $T$ to be valid, the sign of the wave impedance $Z$ in left-handed materials has to be kept positive. Thus, for our particular choice of parameters ($\varepsilon = -\varepsilon_0$, $\mu = -\mu_0$), Eqs. (91a) and (95a) are valid with $Z_+ = Z_- = Z_0$ and $\cos r = \cos\theta$, giving $R = 0$ for any linear polarization, and hence for any other wave polarization: circular, elliptic, natural, etc.

The perfect lens suggestion has triggered a wave of efforts to implement left-hand materials experimentally. (Attempts to find such materials in nature have failed so far.) Most progress in this direction has been achieved using the so-called metamaterials, which are essentially quasi-periodic arrays of specially designed electromagnetic resonators, ideally with high density $n \gg \lambda^{-3}$. For example, Fig. 15a shows the metamaterial that was used for the first demonstration of negative refractivity in the microwave region, i.e. at a few-GHz frequencies; see Fig. 15b. It combines straight strips of a metallic film, working as lumped resonators with a large electric dipole moment (hence strongly coupled to the wave's electric field $\mathbf{E}$), and several almost-closed film loops (so-called split rings), working as lumped resonators with large magnetic dipole moments, coupled to field $\mathbf{H}$. By designing the resonance frequencies close to each other, the negative refractivity may be achieved; see the black line in Fig. 15b, which shows experimental data. Recently, the negative refractivity was demonstrated in the optical range, albeit at relatively large absorption that spoils all potentially useful features of the left-handed materials.

35 Inspired by this fact, in some publications the left-hand materials are prescribed a negative index of refraction $n$. However, this prescription should be treated with care (for example, it complies with the first form of Eq. (84), but not its second form), and the sign of $n$, in contrast to that of wave vector $\mathbf{k}$, is a matter of convention.




Fig. 7.15. The first artificial 
left-hand material with 
experimentally demonstrated 
negative refraction in a 
microwave region. Adapted 
from R. Shelby et al., Science 
292,77 (2001). © AAAS. 


This progress has stimulated the development of other potential uses of metamaterials (not necessarily the left-handed ones), in particular designs of nonuniform systems with engineered distributions $\varepsilon(\mathbf{r}, \omega)$ and $\mu(\mathbf{r}, \omega)$, which may provide electromagnetic wave propagation along the desired paths, e.g. around a certain region of space (Fig. 16), making it virtually invisible for an external observer; so far, within a limited frequency range, and for a certain wave polarization only. Due to these restrictions, the practical value of this work on such invisibility cloaks is not yet clear (at least to this author); but so much attention is focused on this issue 36 that the situation should become much clearer in just a few years.




Fig. 7.16. Experimental demonstration of a 
prototype 2D “invisibility cloak” in the 
microwave region. Adapted from D. Schurig 
et al.. Science 314 , 977 (2006). © AAAS. 


36 For a recent review, see, e.g., B. Wood, Comptes Rendus Physique 10 , 379 (2009). 
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7.6. Transmission lines: TEM waves 


So far, we have analyzed plane electromagnetic waves with infinite cross-section. The cross-section may be limited, still sustaining wave propagation, using wave transmission lines (also called waveguides): cylindrically-shaped structures made of either good conductors or dielectrics. Let us first discuss the first option. In order to keep our analysis (relatively :-) simple, let us assume that:

(i) the structure is a cylinder (not necessarily with a round cross-section, see Fig. 17) filled with a usual (right-handed), uniform dielectric material with negligible losses: $\varepsilon = \varepsilon' > 0$, $\mu = \mu' > 0$, and

(ii) the wave attenuation due to the skin effect is also negligibly low. (As Eq. (78) indicates, for that the characteristic size $a$ of the waveguide's cross-section has to be much larger than the skin depth $\delta_s$ of its wall material. The effect of skin-effect losses will be analyzed in Sec. 10 below.)

After such exclusion of attenuation, we may look for a particular solution of the Maxwell equations in the form of a monochromatic wave traveling along the waveguide:

$$\mathbf{E}(\mathbf{r}, t) = {\rm Re}\left[\mathbf{E}_\omega(x, y)\,e^{i(k_z z - \omega t)}\right], \qquad \mathbf{H}(\mathbf{r}, t) = {\rm Re}\left[\mathbf{H}_\omega(x, y)\,e^{i(k_z z - \omega t)}\right], \qquad (7.98)$$

with real $k_z$. Note that this form allows an account for a substantial coordinate dependence of the electric and magnetic field in the plane $\{x, y\}$ of the waveguide's cross-section, as well as for longitudinal components of the fields, so that solution (98) is substantially more complex than the plane waves we have discussed above. We will see in a minute that as a result of this dependence, constant $k_z$ may be very much different from the plane-wave value (13), $k = \omega(\varepsilon\mu)^{1/2}$, in the same material.


Fig. 7.17. Decomposition of the electric field in a waveguide.


In order to describe these effects explicitly, let us decompose the complex amplitudes of the fields into the longitudinal and transverse components (Fig. 17) 37

$$\mathbf{E}_\omega = E_z\mathbf{n}_z + \mathbf{E}_t, \qquad \mathbf{H}_\omega = H_z\mathbf{n}_z + \mathbf{H}_t. \qquad (7.99)$$

Plugging Eqs. (98)-(99) into the homogeneous Maxwell equations (2), and requiring the longitudinal and transverse components to be balanced separately, we get

$$ik_z\mathbf{n}_z\times\mathbf{E}_t - i\omega\mu\mathbf{H}_t = -\nabla_t\times(E_z\mathbf{n}_z), \qquad ik_z\mathbf{n}_z\times\mathbf{H}_t + i\omega\varepsilon\mathbf{E}_t = -\nabla_t\times(H_z\mathbf{n}_z),$$

$$\nabla_t\times\mathbf{E}_t = i\omega\mu H_z\mathbf{n}_z, \qquad \nabla_t\times\mathbf{H}_t = -i\omega\varepsilon E_z\mathbf{n}_z, \qquad (7.100)$$

$$\nabla_t\cdot\mathbf{E}_t = -ik_zE_z, \qquad \nabla_t\cdot\mathbf{H}_t = -ik_zH_z,$$

where $\nabla_t$ is the 2D del operator acting in the transverse plane $[x, y]$. These equations may look even bulkier than the original Maxwell equations, but actually are much simpler for analysis. Indeed, eliminating the transverse components from these equations (or, even simpler, just plugging Eq. (99) into Eqs. (3) and keeping just their $z$-components), we may get a pair of self-consistent equations for the longitudinal components of the fields, 38

$$(\nabla_t^2 + k_t^2)E_z = 0, \qquad (\nabla_t^2 + k_t^2)H_z = 0, \qquad (7.101)$$

where $k$ is still defined by Eq. (13), $k = (\varepsilon\mu)^{1/2}\omega$, and

$$k_t^2 \equiv k^2 - k_z^2 = \omega^2\varepsilon\mu - k_z^2. \qquad (7.102)$$

37 Note that for notation simplicity, I am dropping index $\omega$ in the complex amplitudes of the field components, and later will drop argument $\omega$ in $k_z$ and $Z$, though they may depend on the wave frequency rather substantially; see below.


After distributions $E_z(x, y)$ and $H_z(x, y)$ have been found from these equations, they provide right-hand parts for a rather simple, closed system of equations (100) for the transverse components of field vectors. Moreover, as we will see below, each of the following three types of solutions:

(i) with $E_z = 0$ and $H_z = 0$ (called the transverse, or TEM waves),

(ii) with $E_z = 0$, but $H_z \neq 0$ (called either TE waves or, more frequently, H modes), and

(iii) with $E_z \neq 0$, but $H_z = 0$ (TM waves or E modes),

has its own dispersion law and hence wave propagation velocity; as a result, these modes (the term meaning the field distribution pattern) may be considered separately.

Let us start with the simplest, TEM waves with no longitudinal components of either field. For them, the top two equations of system (100) immediately give Eqs. (6) and (13), and $k_z = k$. In plain English, this means that $\mathbf{E} = \mathbf{E}_t$ and $\mathbf{H} = \mathbf{H}_t$ are proportional to each other and mutually perpendicular (just as in the plane wave) at each point of the cross-section, and that the TEM wave impedance $Z \equiv E/H$ and dispersion law $\omega(k)$, and hence the propagation speed, are the same as in a plane wave in the material filling the waveguide. In particular, if $\varepsilon$ and $\mu$ are frequency-independent within a certain frequency range, the dispersion law is linear, $\omega = k/(\varepsilon\mu)^{1/2}$, and the wave's speed does not depend on its frequency. For practical applications to telecommunications, this is a very important advantage of TEM waves over their TM and TE counterparts, to be discussed below.

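Unfortunately, such waves cannot propagate in every waveguide. In order to show this, let us have a look at the two last lines of Eqs. (100). For the TEM waves (E_z = 0, H_z = 0, k_z = k), they yield

$$
\begin{aligned}
\nabla_t\times\mathbf{E}_t &= 0, & \nabla_t\times\mathbf{H}_t &= 0,\\
\nabla_t\cdot\mathbf{E}_t &= 0, & \nabla_t\cdot\mathbf{H}_t &= 0.
\end{aligned} \tag{7.103}
$$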


In the macroscopic approximation of the boundary conditions (i.e., neglecting the screening and skin depths), we have to require that the wave does not penetrate the walls, so that inside them, E = H = 0. Close to the wall but inside the waveguide, the normal component E_n of the electric field may be different from zero, because surface charges may sustain its jump (see Sec. 2.1). Similarly, the tangential component H_τ of the magnetic field may have a finite jump at the surface due to skin currents. However, the tangential component of the electric field and the normal component of the magnetic field cannot experience such a jump, and in order to vanish inside the walls, they have to equal zero near the walls inside the waveguide as well:

$$E_\tau = 0, \qquad H_n = 0. \tag{7.104}$$

38 The wave equation presented in the form (98) is called the (in our particular case, 2D) Helmholtz equation, after H. von Helmholtz (1821-1894) - the mentor of H. Hertz and M. Planck, among many others.

But the left columns of Eqs. (103) and (104) coincide with the formulation of the 2D boundary problem of electrostatics for the electric field induced by electric charges of the conducting walls, with the only difference that in our current case the value of ε should be replaced with ε(ω). Similarly, the right columns of those relations coincide with the formulation of the 2D boundary problem of magnetostatics for the magnetic field induced by currents in the walls, with μ = μ(ω). The only difference is that in our current case the magnetic fields should not penetrate inside the conductors.

Now we immediately see that in waveguides with a singly-connected wall topology (see, e.g., the particular example shown in Fig. 17), TEM waves are impossible, because there is no way to create a finite electrostatic field inside a conductor with such a cross-section. Fortunately, such fields (and hence TEM waves) are possible in structures with cross-sections consisting of two or more disconnected (dc-insulated) parts - see, e.g., Fig. 18. (Such structures are more frequently called transmission lines rather than waveguides, the latter term being mostly reserved for lines with singly-connected cross-sections of the walls.)




Fig. 7.18. Example of the cross-section 
of a transmission line that may support 
the TEM wave propagation. 



Now we can readily derive some “global” relations for each conductor, independent of the exact shape of its cross-section. Indeed, consider contour C drawn very close to the conductor’s surface (see, e.g., the red dashed line in Fig. 18). First, we can consider it as a cross-section of a cylindrical Gaussian volume of a certain length dz << λ ≡ 2π/k. Using the generalized Gauss law (3.29), we get

$$\oint_C (\mathbf{E}_t)_n\,dr = \frac{\lambda_\omega}{\varepsilon}, \tag{7.105}$$

where λ_ω (not to be confused with wavelength λ) is the linear density of electric charge of the conductor. Second, the same contour C may be used in the generalized Ampère law (5.131) to write

$$\oint_C (\mathbf{H}_t)_\tau\,dr = I_\omega, \tag{7.106}$$

where I_ω is the total current flowing along the conductor (or rather its complex amplitude). But, as was mentioned above, in the TEM wave the ratio E_t/H_t of the field components participating in these two integrals is constant and equal to Z = (μ/ε)^{1/2}, so that Eqs. (105)-(106) give the following simple relation between the “global” characteristics of the conductor:

$$I_\omega = \frac{\lambda_\omega/\varepsilon}{Z} = \frac{\lambda_\omega}{(\varepsilon\mu)^{1/2}} = \frac{\omega}{k}\lambda_\omega. \tag{7.107}$$


This relation may also be obtained by different means; let me describe them, because they have an independent value. Let us consider a small segment dz << λ = 2π/k of the conductor (limited by the red dashed line in Fig. 18) and apply the electric charge conservation law (4.1) to the instant values of the linear charge density and current. The cancellation of dz in both parts yields

$$\frac{\partial\lambda(z,t)}{\partial t} = -\frac{\partial I(z,t)}{\partial z}. \tag{7.108}$$

(If we accept the sinusoidal waveform, exp{i(kz - ωt)}, for both these variables, we immediately recover Eq. (107) for their complex amplitudes, so that the result just expresses the charge continuity law. However, Eq. (108) is valid for any waveform.)

The global equation (108) may be made more specific in the case when the frequency dependence of ε and μ is negligible, and the transmission line consists of just two isolated conductors (see, e.g., Fig. 18). In this case, in order to have the wave well localized in the space near the two conductors, we need a sufficiently fast convergence of its electric field at large distances. 39 For that, their linear charge densities for each value of z should be equal and opposite, and we can simply relate them to the potential difference V between the conductors:

$$\frac{\lambda(z,t)}{V(z,t)} = C_0, \tag{7.109}$$

where C_0 is the mutual capacitance of the conductors per unit length - that was repeatedly discussed in Chapter 2. Then Eq. (108) takes the form

$$C_0\frac{\partial V}{\partial t} = -\frac{\partial I}{\partial z}. \tag{7.110}$$


Next, let us consider the contour shown with the red dashed line in Fig. 19 (which shows a cross-section of the transmission line by a plane containing the wave propagation axis z), and apply to it the Faraday induction law (6.3). Since the electric field is zero inside the conductors (in Fig. 19, on the horizontal parts of the contour), the total e.m.f. equals the difference of voltages V at the ends of the segment dz. Since the only sources of the magnetic flux through the area limited by the contour are the (equal and opposite) currents ±I in the conductors, we can use Eq. (5.70) to express it. As a result, canceling dz in both parts of the equation, we get

$$L_0\frac{\partial I}{\partial t} = -\frac{\partial V}{\partial z}, \tag{7.111}$$


39 The alternative is to have a virtually plane wave, which propagates along the transmission line conductors, and 
whose fields are just slightly deformed in their vicinity. Such a wave cannot be “guided” by the conductors, and 
hardly deserves the name of a “wave in the waveguide”. 
where L_0 is the mutual inductance of the conductors per unit length. The only difference between L_0 and the dc mutual inductances discussed in Chapter 5 is that at the high frequencies we are analyzing now, L_0 should be calculated neglecting the magnetic field penetration into the conductors. (In the dc case, we had the same situation for superconductor electrodes, within their crude, ideal-diamagnetic description.)


Fig. 7.19. Electric current, magnetic flux, and voltage in a two-conductor transmission line.


The system of Eqs. (110) and (111) is frequently called the telegrapher’s equations. Combined, they give for any “global” variable f (either V, or I, or λ) a 1D wave equation,

$$\frac{\partial^2 f}{\partial z^2} - L_0C_0\frac{\partial^2 f}{\partial t^2} = 0, \tag{7.112}$$

which describes the dispersion-free TEM wave propagation. Again, this equation is only valid within the frequency range where the frequency dependence of both ε and μ is negligible. If it is not so, the global approach may still be used for sinusoidal waves f = Re[f_ω exp{i(kz - ωt)}]. Repeating the above arguments, instead of Eqs. (110)-(111) we get algebraic equations

$$\omega C_0V_\omega = kI_\omega, \qquad \omega L_0I_\omega = kV_\omega, \tag{7.113}$$

in which L_0 ∝ μ and C_0 ∝ ε may now depend on frequency.

Two linear equations (113) are consistent only if

$$L_0C_0 = \frac{k^2}{\omega^2} = \varepsilon\mu. \tag{7.114}$$

Besides the fact we have already known (that the TEM wave speed is the same as that of the plane wave), Eq. (114) gives us a result that I confess I have not emphasized enough in Chapter 5: the product L_0C_0 does not depend on the shape or size of the line’s cross-section (provided that the magnetic field penetration into the conductors is negligible). Hence, if we have calculated the mutual capacitance C_0 of a system of two cylindrical conductors, the result immediately gives us their mutual inductance: L_0 = εμ/C_0. This relation stems from the fact that both the electric and magnetic fields may be expressed via the solution of a 2D Laplace equation for the system’s cross-section.
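The dispersion-free propagation described by the telegrapher’s equations (110)-(111) may be checked numerically. The following Python sketch is only an illustration under stated assumptions: the values L_0 = 250 nH/m and C_0 = 100 pF/m are arbitrary, the function name `simulate_telegrapher` and the leapfrog scheme on a staggered grid are my choices rather than anything prescribed above, and the initial current is taken as V/(L_0/C_0)^{1/2}, the ratio that Eqs. (113)-(114) give for a wave running in the +z direction.

```python
import math

def simulate_telegrapher(L0, C0, n_cells=400, n_steps=150, width=10.0, start=100):
    """Integrate C0*dV/dt = -dI/dz and L0*dI/dt = -dV/dz by the leapfrog method.

    Returns (measured pulse speed, 1/sqrt(L0*C0)).
    """
    v = 1.0 / math.sqrt(L0 * C0)   # speed implied by the wave equation (112)
    ratio = math.sqrt(L0 / C0)     # V/I ratio of a wave running in +z
    dz = 1.0                       # grid step in meters (illustrative)
    dt = dz / v                    # time step equal to the cell transit time

    def pulse(j):
        return math.exp(-0.5 * ((j - start) / width) ** 2)

    # V[j] sits at z = j*dz, t = 0; I[j] sits at z = (j + 1/2)*dz, t = dt/2.
    V = [pulse(j) for j in range(n_cells)]
    I = [pulse(j) / ratio for j in range(n_cells - 1)]

    for _ in range(n_steps):
        for j in range(1, n_cells - 1):          # Eq. (110)
            V[j] -= dt / (C0 * dz) * (I[j] - I[j - 1])
        for j in range(n_cells - 1):             # Eq. (111)
            I[j] -= dt / (L0 * dz) * (V[j + 1] - V[j])

    peak = max(range(n_cells), key=lambda j: V[j])
    return (peak - start) * dz / (n_steps * dt), v


measured, expected = simulate_telegrapher(L0=2.5e-7, C0=1.0e-10)
print(f"pulse speed: {measured:.4e} m/s, 1/sqrt(L0*C0): {expected:.4e} m/s")
```

With the time step chosen equal to the transit time of one grid cell, the scheme shifts the pulse by exactly one cell per step, so that the pulse shape is preserved; for smaller time steps, the numerical scheme itself would introduce some dispersion that is absent from Eq. (112).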

With Eq. (114) satisfied, any of Eqs. (113) gives the same result for the ratio

$$Z_W \equiv \frac{V_\omega}{I_\omega} = \left(\frac{L_0}{C_0}\right)^{1/2}, \tag{7.115}$$

that is called the transmission line’s impedance. This parameter has the same dimensionality (in SI units, ohms) as the wave impedance (7),

$$Z \equiv \frac{E_\omega}{H_\omega} = \left(\frac{\mu}{\varepsilon}\right)^{1/2}, \tag{7.116}$$

but these parameters should not be confused, because Z_W depends on the cross-section’s geometry, while Z does not. In particular, Z_W is the only important parameter of a transmission line for matching with a lumped load circuit (Fig. 20) in the important case when both the cable cross-section’s size and the load’s linear dimensions are much smaller than the wavelength. (The ability of TEM lines to have such a small cross-section is another important advantage of theirs.) Indeed, in this case we may consider the load in the quasistatic limit and write

$$V_\omega(z_0) = Z_L(\omega)I_\omega(z_0), \tag{7.117}$$

where Z_L(ω) is the (generally complex) impedance of the load. Taking V(z,t) and I(z,t) in the form similar to Eqs. (61) and (62), and writing two Kirchhoff’s laws for point z = z_0, we get for the reflection coefficient a result similar to Eq. (68):

$$R = \frac{Z_L(\omega) - Z_W}{Z_L(\omega) + Z_W}. \tag{7.118}$$

This formula shows that for perfect matching (i.e. the total wave absorption in the load), the load’s impedance Z_L(ω) should be real and equal to Z_W - but not necessarily to Z.
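To get a feeling for Eq. (118), it may be evaluated for a few loads. The sketch below is just an illustration: the function names and the sample load values are hypothetical, and the absorbed power fraction 1 - |R|² is the usual consequence of energy conservation for a lossless line, which is an assumption not derived in the text above.

```python
def reflection_coefficient(z_load, z_line):
    """Eq. (118): amplitude reflection coefficient of a lumped load."""
    return (z_load - z_line) / (z_load + z_line)

def absorbed_fraction(z_load, z_line):
    """Fraction of the incident power absorbed by the load, 1 - |R|^2."""
    return 1.0 - abs(reflection_coefficient(z_load, z_line)) ** 2

# A 75-ohm line with a matched, a mismatched, a shorted, and two complex loads.
for z_load in (75.0, 50.0, 0.0, 75.0 + 40.0j, 30.0j):
    r = reflection_coefficient(z_load, 75.0)
    print(z_load, r, absorbed_fraction(z_load, 75.0))
```

Python’s built-in complex numbers are sufficient here; a purely reactive load (such as 30j ohms in the last case) gives |R| = 1, i.e. no absorption at all.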




Fig. 7.20. Transmission line 
impedance matching. 


As an example, let us consider one of the simplest (and the most important) transmission lines: 
the coaxial cable (Fig. 21). 40 



Fig. 7.21. Cross-section of a coaxial cable with
arbitrary (possibly, dispersive) dielectric filling.


For this geometry, we already know expressions for both L_0 and C_0, though they have to be modified for the dielectric constant and the magnetic field non-penetration into the conductors. After that modification,

$$C_0 = \frac{2\pi\varepsilon}{\ln(b/a)}, \qquad L_0 = \frac{\mu}{2\pi}\ln(b/a). \tag{7.119}$$

So, the universal relation (114) is indeed valid! For the cable’s impedance (115), Eqs. (119) yield

$$Z_W = \left(\frac{\mu}{\varepsilon}\right)^{1/2}\frac{\ln(b/a)}{2\pi} = Z\frac{\ln(b/a)}{2\pi} \neq Z. \tag{7.120}$$

40 The coaxial cable was first patented by O. Heaviside in 1880.


For standard TV antenna cables (such as RG-6/U, with b/a ~ 3, ε/ε_0 ≈ 2.2), Z_W = 75 ohms, while for most computer component connections, cables with Z_W = 50 ohms (such as RG-58/U) are prescribed by electronic engineering standards. Such cables are broadly used for transfer of electromagnetic waves with frequencies (limited mostly by cable attenuation; see Sec. 10 below) up to 1 GHz over distances of a few km, and up to ~20 GHz on the tabletop scale (a few meters).
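Equations (119)-(120) are easy to tabulate. The following sketch is a hedged illustration: the function names are hypothetical, the radius ratio b/a = 3.5 and the relative dielectric constant 2.25 are sample values not tied to any particular commercial cable, and the filling is assumed to be non-magnetic unless `mu_r` is given.

```python
import math

EPS0 = 8.8541878128e-12      # F/m
MU0 = 4.0e-7 * math.pi       # H/m (the pre-2019 defined value, adequate here)

def coax_line(b_over_a, eps_r=1.0, mu_r=1.0):
    """Return (C0, L0, Zw) of a coaxial cable from Eqs. (119)-(120)."""
    eps, mu = eps_r * EPS0, mu_r * MU0
    log_ratio = math.log(b_over_a)
    c0 = 2.0 * math.pi * eps / log_ratio       # F/m
    l0 = mu * log_ratio / (2.0 * math.pi)      # H/m
    zw = math.sqrt(mu / eps) * log_ratio / (2.0 * math.pi)
    return c0, l0, zw

def radius_ratio_for(zw_target, eps_r=1.0, mu_r=1.0):
    """Invert Eq. (120): the b/a ratio giving the impedance zw_target."""
    z_medium = math.sqrt(mu_r * MU0 / (eps_r * EPS0))
    return math.exp(2.0 * math.pi * zw_target / z_medium)

c0, l0, zw = coax_line(b_over_a=3.5, eps_r=2.25)
print(f"C0 = {c0:.3e} F/m, L0 = {l0:.3e} H/m, Zw = {zw:.1f} ohm")
print("L0*C0 / (eps*mu) =", l0 * c0 / (2.25 * EPS0 * MU0))   # Eq. (114)
print("b/a for 50 ohm:", radius_ratio_for(50.0, eps_r=2.25))
```

Because Z_W depends on b/a only logarithmically, the inverse function is exponentially sensitive to the target impedance, so that practical impedances correspond to a rather narrow range of radius ratios.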

Another important example of TEM transmission lines is the set of two parallel wires. In the form of twisted pairs, 41 they allow communications, in particular long-range telephone and DSL Internet connections, at frequencies up to a few hundred kHz, as well as relatively short Ethernet and TV cables at frequencies up to ~1 GHz, limited mostly by the mutual interference and parasitic radiation effects.


7.7. H and E waves in metallic waveguides 


Let us now return to Eqs. (100) and explore the TE and TM waves - with, respectively, either H_z or E_z different from zero. At first sight, they may seem more complex. However, equations (101), which determine the distribution of these longitudinal components over the cross-section, are just 2D Helmholtz equations for scalar functions. For simple cross-section geometries, they may be solved using the methods discussed for the Laplace equation in Chapter 2, in particular the variable separation. After the solution of such an equation has been found, the transverse components of the fields may be calculated by differentiation, using the simple formulas

$$\mathbf{E}_t = \frac{i}{k_t^2}\left[k_z\nabla_tE_z - kZ(\mathbf{n}_z\times\nabla_tH_z)\right], \qquad \mathbf{H}_t = \frac{i}{k_t^2}\left[k_z\nabla_tH_z + \frac{k}{Z}(\mathbf{n}_z\times\nabla_tE_z)\right], \tag{7.121}$$

which follow from the two equations in the first line of Eqs. (100). 42


In comparison with the electro- and magnetostatics problems, the only conceptually new feature of Eqs. (101), with appropriate boundary conditions, is that they form the so-called eigenproblems, with typically many solutions (eigenfunctions), each describing a specific wave mode, and corresponding to a specific eigenvalue of parameter k_t. The good news here is that these values of k_t are determined by this 2D boundary problem and hence do not depend on k_z. As a result, the dispersion law ω(k_z) of each mode, that follows from the last form of Eq. (102),

$$\omega = \left(\frac{k_z^2 + k_t^2}{\varepsilon\mu}\right)^{1/2} = \left(v^2k_z^2 + \omega_c^2\right)^{1/2}, \tag{7.122}$$


41 The twisting reduces mutual induction (“crosstalk”) between the lines, and parasitic radiation at their bends. 

42 For that, one of these two linear equations should first be vector-multiplied by n_z. Note that this approach could not be used to analyze TEM waves, because for them k_t = 0, E_z = 0, H_z = 0, and Eqs. (121) yield an uncertainty.

is functionally the same as that of plane waves in a plasma (see Eq. (38), Fig. 6, and their discussion), with the only differences that c is now replaced with v = 1/(εμ)^{1/2}, the speed of plane (or any TEM) waves in the medium filling the waveguide, and ω_p is replaced with the so-called cutoff frequency

$$\omega_c \equiv vk_t, \tag{7.123}$$

specific for each mode. (As Eq. (101) implies, and as we will see from several examples below, k_t is of the order of 1/a, where a is the characteristic dimension of the waveguide’s cross-section, so that the critical value of the free-space wavelength is of the order of a.) Below the cutoff frequency of each particular mode, it cannot propagate in the waveguide. 43 As a result, modes with the lowest values of ω_c present special practical interest, because the choice of the signal frequency ω between the two lowest values of cutoff frequency guarantees that the waves propagate in the form of only one mode, with the lowest k_t. Such a choice simplifies the excitation of the desired mode by wave generators, and avoids the parasitic transfer of electromagnetic wave energy to undesirable modes by (unavoidable) small inhomogeneities of the system.
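The dispersion law (122) with cutoff (123) may be explored with a few lines of code. This is a minimal sketch under assumptions of my own: the function names are hypothetical, the waveguide is taken to be air-filled (v = c), and the sample value k_t = π/a with a = 2.3 cm is used only as an illustration. The group velocity is obtained by differentiating Eq. (122), dω/dk_z = v²k_z/ω.

```python
import cmath
import math

def longitudinal_wave_number(omega, k_t, v):
    """k_z from Eq. (122); purely imaginary below the cutoff frequency."""
    return cmath.sqrt(omega ** 2 - (v * k_t) ** 2) / v

def phase_and_group_velocity(omega, k_t, v):
    """omega/k_z and d(omega)/d(k_z) for a propagating mode (omega > v*k_t)."""
    k_z = longitudinal_wave_number(omega, k_t, v).real
    return omega / k_z, v ** 2 * k_z / omega

v = 299_792_458.0            # m/s, air filling assumed
k_t = math.pi / 0.023        # 1/m, illustrative transverse wave number

for f in (5e9, 10e9):        # one frequency below and one above the cutoff
    k_z = longitudinal_wave_number(2 * math.pi * f, k_t, v)
    print(f"f = {f:.1e} Hz, k_z = {k_z:.2f} 1/m")

v_ph, v_gr = phase_and_group_velocity(2 * math.pi * 10e9, k_t, v)
print(v_ph / v, v_gr / v, v_ph * v_gr / v ** 2)
```

Below the cutoff, the imaginary k_z describes a field decaying exponentially along the waveguide; above it, the product of the phase and group velocities equals v², just as for waves in plasma.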

The boundary conditions for the Helmholtz equations (101) depend on the propagating wave type. For TM waves (i.e. E modes, with H_z = 0 but E_z ≠ 0), in the macroscopic approximation the boundary condition E_τ = 0 immediately gives

$$E_z|_C = 0, \tag{7.124}$$

where C is the contour limiting the conducting wall’s cross-section. For TE waves (the H modes, with E_z = 0 but H_z ≠ 0), the boundary condition is slightly less obvious and may be obtained using, for example, the second equation of system (100), vector-multiplied by n_z. Indeed, for the component perpendicular to the conductor surface the equation gives

$$ik_z(\mathbf{H}_t)_n - i\frac{k}{Z}(\mathbf{n}_z\times\mathbf{E}_t)_n = \frac{\partial H_z}{\partial n}. \tag{7.125}$$

But the first term in the left-hand part of this equation must be zero on the wall surface, because of the second of Eqs. (104), while according to the first of Eqs. (104), vector E_t in the second term cannot have a component tangential to the wall. As a result, the vector product in that term cannot have a normal component, so that the term should equal zero as well, and Eq. (125) is reduced to

$$\left.\frac{\partial H_z}{\partial n}\right|_C = 0. \tag{7.126}$$


Let us see what this approach gives for a simple but practically important example of a metallic-wall waveguide with a rectangular cross-section. In this case it is natural to use the Cartesian coordinates shown in Fig. 22, so that both Eqs. (101) take the simple form

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + k_t^2\right)f = 0, \qquad f = \begin{cases}E_z, & \text{for TM waves},\\ H_z, & \text{for TE waves}.\end{cases} \tag{7.127}$$

43 An interesting recent twist in the ideas of electromagnetic metamaterials (mentioned in Sec. 5 above) is the so-called ε-near-zero materials, designed to have the effective product εμ much lower than ε_0μ_0 within certain frequency ranges. Since at these frequencies the speed v (4) becomes much lower than c, the cutoff frequency (123) virtually vanishes. As a result, waves may “tunnel” through very narrow sections of metallic waveguides filled with such materials - see, e.g., M. Silveirinha and N. Engheta, Phys. Rev. Lett. 97, 157403 (2006).


From Chapter 2 we know that the most effective way to solve such equations in a rectangular region is the variable separation, in which the general solution is represented as a sum of partial solutions of the type

$$f = X(x)Y(y). \tag{7.128}$$

Plugging this expression into Eq. (127), and dividing each term by XY, we get the equation

$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + k_t^2 = 0, \tag{7.129}$$

that should be satisfied for all values of x and y within the waveguide’s interior. This is only possible if each term of the sum equals a constant. Taking the X-term and Y-term constants in the form (-k_x^2) and (-k_y^2), respectively, and solving the corresponding ordinary differential equations, 44 for eigenfunction (128) we get

$$f = (c_x\cos k_xx + s_x\sin k_xx)(c_y\cos k_yy + s_y\sin k_yy), \quad\text{with } k_x^2 + k_y^2 = k_t^2, \tag{7.130}$$

where constants c and s should be found from the boundary conditions. Here the difference between the H modes and E modes kicks in.



Fig. 7.22. Rectangular waveguide, and the
transverse field distribution in the basic
mode H_10 (schematically).


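For the former modes (TE waves), Eq. (130) is valid for H_z, and we should use condition (126) on all metallic walls of the waveguide (x = 0 and a; y = 0 and b - see Fig. 22). As a result, we get very simple expressions for eigenfunctions and eigenvalues:

$$(H_z)_{nm} = H_l\cos\frac{\pi nx}{a}\cos\frac{\pi my}{b}, \tag{7.131}$$

$$k_x = \frac{\pi n}{a}, \quad k_y = \frac{\pi m}{b}, \quad\text{so that } (k_t)_{nm} = \left(k_x^2 + k_y^2\right)^{1/2} = \pi\left[\left(\frac{n}{a}\right)^2 + \left(\frac{m}{b}\right)^2\right]^{1/2}, \tag{7.132}$$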


44 Let me hope that the solution of equations of the type d²X/dx² + k²X = 0 does not present a problem for the reader, due to his or her prior experience with problems such as standing waves on a guitar string, wavefunctions in a flat 1D quantum well, or (with the replacement x → t) a classical harmonic oscillator.
where H_l is the longitudinal field amplitude, and n and m are two arbitrary integer numbers, except that they cannot be equal to zero simultaneously. (Otherwise, function H_z(x,y) would be constant, so that, according to Eq. (121), the transverse components of the electric and magnetic field would equal zero. As a result, as the last two lines of Eqs. (100) show, the whole field would be zero for any k_z ≠ 0.) Assuming, for certainty, that a > b (as shown in Fig. 22), we see that the lowest eigenvalue of k_t, and hence the lowest cutoff frequency (123), is achieved for the so-called H_10 mode with n = 1 and m = 0, and hence

$$(k_t)_{10} = \frac{\pi}{a} \tag{7.133}$$

(thus confirming our prior estimate of k_t).

Depending on the a/b ratio, the second lowest k_t and cutoff frequency belong to either the H_11 mode with n = 1 and m = 1:

$$(k_t)_{11} = \pi\left(\frac{1}{a^2} + \frac{1}{b^2}\right)^{1/2} = \left[1 + \left(\frac{a}{b}\right)^2\right]^{1/2}(k_t)_{10}, \tag{7.134}$$


or to the H_20 mode with n = 2 and m = 0:

$$(k_t)_{20} = \frac{2\pi}{a} = 2(k_t)_{10}. \tag{7.135}$$

These values become equal at a/b = √3 ≈ 1.7; in practical waveguides, the a/b ratio is not too far from this value. For example, in the standard X-band waveguide WR90, a ≈ 2.3 cm (f_c ≡ ω_c/2π ≈ 6.5 GHz) and b ≈ 1.0 cm.
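The eigenvalue spectrum (132) makes it easy to list the cutoff frequencies of a rectangular waveguide. The sketch below is only an illustration: the function names are hypothetical, the waveguide is assumed to be air-filled, the sizes a = 2.3 cm and b = 1.0 cm are the rounded WR90 values quoted above, and E modes are included with the restriction that both of their indices are nonzero.

```python
import math

C_LIGHT = 299_792_458.0   # m/s; an air-filled waveguide is assumed

def cutoff_frequency(n, m, a, b, v=C_LIGHT):
    """f_c = v*k_t/(2*pi), with k_t from Eq. (132)."""
    k_t = math.pi * math.hypot(n / a, m / b)
    return v * k_t / (2.0 * math.pi)

def lowest_modes(a, b, count=5, max_index=6):
    """Sorted list of (f_c, type, n, m) for the lowest H (TE) and E (TM) modes."""
    modes = []
    for n in range(max_index + 1):
        for m in range(max_index + 1):
            if n == 0 and m == 0:
                continue                      # no mode with n = m = 0
            modes.append((cutoff_frequency(n, m, a, b), "H", n, m))
            if n > 0 and m > 0:               # E modes need both indices nonzero
                modes.append((cutoff_frequency(n, m, a, b), "E", n, m))
    return sorted(modes)[:count]

for f_c, kind, n, m in lowest_modes(a=0.023, b=0.010):
    print(f"{kind}{n}{m}: f_c = {f_c / 1e9:.2f} GHz")
```

The interval between the first two printed frequencies is the single-mode range of such a waveguide.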

Now let us have a brief look at the alternative TM waves (E modes). For them, we should still use the general solution (130) with f = E_z, but now with boundary condition (124). This gives us eigenfunctions

$$(E_z)_{nm} = E_l\sin\frac{\pi nx}{a}\sin\frac{\pi my}{b}, \tag{7.136}$$

and the same eigenvalue spectrum (132) as for the H modes. However, now neither n nor m can be equal to zero; otherwise Eq. (136) would give the trivial solution E_z(x,y) = 0. Hence the lowest cutoff frequency of TM waves is provided by the so-called E_11 mode with n = 1, m = 1, and the eigenvalue is again given by Eq. (134).

Thus the basic (or “fundamental”) H_10 mode is certainly the most important wave in rectangular waveguides; let us have a better look at its field distribution. Plugging the corresponding solution (131) with n = 1 and m = 0 into the general Eqs. (121), we easily get

$$(H_x)_{10} = -i\frac{k_za}{\pi}H_l\sin\frac{\pi x}{a}, \qquad (H_y)_{10} = 0, \tag{7.137}$$

$$(E_x)_{10} = 0, \qquad (E_y)_{10} = i\frac{ka}{\pi}ZH_l\sin\frac{\pi x}{a}. \tag{7.138}$$

This field distribution is (schematically) shown in Fig. 22. Neither of the fields depends on the vertical coordinate y - which is very convenient, in particular, for microwave experiments with small samples. The electric field has only one (vertical) component, which vanishes at the side walls and reaches its maximum at the waveguide’s center; its field lines are straight, starting and ending on wall surface charges (whose distribution propagates along the waveguide together with the wave). In contrast, the magnetic field has two nonvanishing components (H_x and H_z), and its field lines are shaped as horizontal loops wrapped around the electric field maxima.
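The transverse fields of the H_10 mode may be cross-checked numerically by applying Eqs. (121) to the longitudinal field (131). The sketch below is an illustration under my own assumptions: the function name `transverse_fields` is hypothetical, the gradients are approximated by central finite differences, the waveguide is air-filled with Z ≈ 376.73 ohms, and H_l = 1.

```python
import math

def transverse_fields(Ez, Hz, x, y, k_z, k, Z, h=1e-6):
    """Evaluate Eqs. (121) at point (x, y), with central-difference gradients.

    Ez and Hz are functions of (x, y); returns (E_x, E_y, H_x, H_y).
    """
    k_t2 = k ** 2 - k_z ** 2                       # Eq. (102)

    def grad(f):
        return ((f(x + h, y) - f(x - h, y)) / (2 * h),
                (f(x, y + h) - f(x, y - h)) / (2 * h))

    dEx, dEy = grad(Ez)
    dHx, dHy = grad(Hz)
    # For a transverse vector g, n_z x (g_x, g_y) = (-g_y, g_x).
    E_x = 1j / k_t2 * (k_z * dEx - k * Z * (-dHy))
    E_y = 1j / k_t2 * (k_z * dEy - k * Z * dHx)
    H_x = 1j / k_t2 * (k_z * dHx + k / Z * (-dEy))
    H_y = 1j / k_t2 * (k_z * dHy + k / Z * dEx)
    return E_x, E_y, H_x, H_y

a = 0.023                                    # wide-wall size in meters (illustrative)
Z = 376.73                                   # wave impedance of free space, ohms
k = 2 * math.pi * 10e9 / 299_792_458.0       # 10 GHz, above the H10 cutoff
k_z = math.sqrt(k ** 2 - (math.pi / a) ** 2)
Hz = lambda x, y: math.cos(math.pi * x / a)  # Eq. (131) with n = 1, m = 0, H_l = 1
Ez = lambda x, y: 0.0
x0, y0 = a / 3, 0.004
E_x, E_y, H_x, H_y = transverse_fields(Ez, Hz, x0, y0, k_z, k, Z)
s = math.sin(math.pi * x0 / a)
print(E_y, 1j * k * a / math.pi * Z * s)     # compare with Eq. (138)
print(H_x, -1j * k_z * a / math.pi * s)      # compare with Eq. (137)
```

The same function may be applied to any other mode by swapping the two longitudinal field functions, e.g. to Eq. (136) for E modes.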

An important question is whether the H_10 wave may be usefully characterized by a unique impedance introduced similarly to Z_W of the TEM modes - see Eq. (115). The answer is no, because the main value of Z_W is a convenient description of the impedance matching of the transmission line with a lumped load - see Fig. 20 and Eq. (118). As was discussed above, such a simple description is possible (i.e., does not depend on the exact geometry of the connection) only if both dimensions of the line’s cross-section are much less than λ. But for the H_10 wave (and more generally, any non-TEM mode) this is impossible - see, e.g., Eq. (133): its lowest frequency corresponds to the TEM wavelength λ_max = 2π/(k_t)_min = 2π/(k_t)_10 = 2a. 45


Now let us consider metallic waveguides with a round cross-section (Fig. 23a). In this singly-connected geometry, again, the TEM waves are impossible, while for the analysis of H modes and E modes the polar coordinates {ρ, φ} are most natural. In these coordinates, the 2D Helmholtz equation (101) takes the form

$$\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2} + k_t^2\right]f = 0, \qquad f = \begin{cases}E_z, & \text{for TM waves},\\ H_z, & \text{for TE waves}.\end{cases} \tag{7.139}$$

Separating the variables as f = ℛ(ρ)ℱ(φ), we get

$$\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \frac{1}{\rho^2\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2} + k_t^2 = 0. \tag{7.140}$$


Fig. 7.23. (a) Metallic and (b) dielectric
waveguides with circular cross-sections.


But this is exactly Eq. (2.127) that was studied in the context of electrostatics, just with a replacement of notation: γ → k_t. So we already know that in order to have 2π-periodic functions ℱ(φ), and finite values ℛ(0) (which are necessary for our current case - see Fig. 23a), the general solution is given by Eq. (2.136), i.e. the eigenfunctions may be expressed via integer-order Bessel functions of the first kind: 46

$$f_{nm} = \text{const}\times J_n(k_{nm}\rho)e^{in\varphi}, \tag{7.141}$$

with eigenvalues k_nm of the transverse wave number k_t to be determined from appropriate boundary conditions.

45 The reader is encouraged to find a simple interpretation of this equality.

As for the rectangular waveguide, let us start from H modes (f = H_z). Then the boundary condition on the wall surface (ρ = R) is given by Eq. (126), which, for solution (141), takes the form

$$\frac{d}{d\xi}J_n(\xi) = 0, \qquad \xi \equiv k_{nm}R. \tag{7.142}$$

This means that eigenvalues of Eq. (139) are

$$k_t = k_{nm} = \frac{\xi'_{nm}}{R}, \tag{7.143}$$

where ξ’_nm is the m-th root of function dJ_n(ξ)/dξ. The approximate values of these roots for several lowest n and m may be read out from the plots in Fig. 2.16; their more accurate values are presented in Table 1 below.


Table 7.1. Roots ξ’_nm of function dJ_n(ξ)/dξ for a few values of the Bessel function’s index n and the root’s number m.

           m = 1      m = 2       m = 3
  n = 0    3.83171    7.015587    10.1735
  n = 1    1.84118    5.33144     8.53632
  n = 2    3.05424    6.70613     9.96947
  n = 3    4.20119    8.01524     11.34592


It shows, in particular, that the lowest of the roots is ξ’_11 ≈ 1.84. Thus, a bit counter-intuitively, the basic mode, providing the lowest cutoff frequency ω_c = vk_nm, is H_11, corresponding to n = 1 rather than n = 0: 47

$$H_z = H_lJ_1\left(\xi'_{11}\frac{\rho}{R}\right)e^{i\varphi}, \tag{7.144}$$

with the transverse wave vector k_t = k_11 = ξ’_11/R ≈ 1.84/R, and hence the cutoff frequency corresponding to the TEM wavelength λ_max = 2π/k_11 ≈ 3.41R. Thus the ratio of λ_max to the waveguide diameter 2R is about 1.7, i.e. is close to the ratio λ_max/a = 2 for the rectangular waveguide. The origin of this proximity is clear from Fig. 24, which shows the transverse field distribution in the H_11 mode. (It may be readily calculated from Eqs. (121) with E_z = 0 and H_z given by the real part of Eq. (144).)

46 In Chapter 2, it was natural to take the angular dependence in the sin-cos form, which is equivalent to adding a similar term with n → -n to the right-hand part of Eq. (141). However, since the functions f we are discussing now are already complex, it is easier to do calculations in the exponential form - though it is vital to restore real fields before calculating any of their nonlinear forms, e.g., the wave power.

47 The lowest root of Eq. (142) with n = 0, i.e. ξ’_00, equals 0, and would yield k = 0 and hence a constant field H_z, which, according to the first of Eqs. (121), would give a vanishing electric field.



Fig. 7.24. Transverse field components in the
basic H_11 mode of a metallic, circular waveguide
(schematically).


One can see that the field structure is actually very similar to that of the basic mode in the rectangular waveguide, shown in Fig. 22, despite the different nomenclature (due to the different type of coordinates used). However, note the arbitrary argument of the complex constant H_l in Eq. (144), indicating that in circular waveguides the transverse field polarization is arbitrary. For some practical applications, the degeneracy of these “quasi-linearly-polarized” waves creates problems; they may be avoided by using waves with circular polarization. 48

As Table 1 shows, the next lowest H mode is H_21, for which k_t = k_21 = ξ’_21/R ≈ 3.05/R, about 1.7 times larger than that of the basic mode, and only then comes the first mode with no angular dependence of any field, H_01, with k_t = k_01 = ξ’_01/R ≈ 3.83/R. 49

For the E modes, we may still use Eq. (141) (with f = E_z), but with boundary condition (124) at ρ = R. This gives the following equation for the problem eigenvalues:

$$J_n(k_{nm}R) = 0, \quad\text{i.e. } k_{nm} = \frac{\xi_{nm}}{R}, \tag{7.145}$$

where ξ_nm is the m-th root of function J_n(ξ) - see Table 2.1. The table shows that the lowest k_t equals ξ_01/R ≈ 2.405/R. Hence the corresponding mode (E_01), with

$$E_z = E_lJ_0\left(\xi_{01}\frac{\rho}{R}\right), \tag{7.146}$$

has the second lowest cutoff frequency, approximately 30% higher than that of the basic mode H_11.
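The roots of J_n(ξ) and dJ_n(ξ)/dξ used above may be reproduced without any special-function library. The sketch below is a hedged illustration with hypothetical function names: it evaluates J_n from Bessel’s integral representation by the trapezoidal rule, uses the standard recurrence J_n’ = (J_{n-1} - J_{n+1})/2 for the derivative, and locates the roots by a simple scan followed by bisection. The scan starts at ξ = 0.1 in order to skip the trivial root at ξ = 0.

```python
import math

def bessel_j(n, x, steps=400):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))      # half-weighted end points
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total / steps

def bessel_j_prime(n, x):
    """dJ_n/dx from the recurrence J_n' = (J_{n-1} - J_{n+1}) / 2."""
    return 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x))

def positive_roots(func, count, x_start=0.1, dx=0.05):
    """First `count` roots of func at x > x_start, by scanning and bisection."""
    roots = []
    x, f_x = x_start, func(x_start)
    while len(roots) < count:
        f_next = func(x + dx)
        if f_x * f_next < 0.0:                       # sign change: bracket a root
            lo, hi, f_lo = x, x + dx, f_x
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = func(mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        x, f_x = x + dx, f_next
    return roots

for n in range(4):                                   # rows of Table 7.1
    row = positive_roots(lambda x: bessel_j_prime(n, x), 3)
    print(n, [round(r, 5) for r in row])

xi_01 = positive_roots(lambda x: bessel_j(0, x), 1)[0]
xi_prime_11 = positive_roots(lambda x: bessel_j_prime(1, x), 1)[0]
print(xi_01, xi_01 / xi_prime_11)                    # E_01 versus H_11 cutoff
```

The trapezoidal rule is a natural choice here because the integrand is smooth and periodic, so that a few hundred points are sufficient for the arguments of interest; for larger arguments or indices, a dedicated library routine would be a safer choice.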

48 Actually, Eq. (144) does describe a circularly polarized wave, while the real and imaginary parts of this expression describe two mutually perpendicular quasi-linearly-polarized waves.

49 Electric field lines in the H_01 mode (as well as all higher H_0m modes) are directed straight from the axis to the walls, resembling those of TEM waves in the coaxial cable. Due to this property, these modes provide, at ω >> ω_c, much lower power losses (see Sec. 10 below) than the fundamental H_11 mode, and are sometimes used in practice, despite all inconveniences of working in the multimode frequency range.

Finally, let us discuss one more topic of general importance - the number N of electromagnetic modes that may propagate in a waveguide within a certain range of relatively large frequencies ω >> ω_c. This is easy to calculate for a rectangular waveguide, with its simple expressions (132) for the eigenvalues of {k_x, k_y}. Indeed, these expressions describe a rectangular mesh on the [k_x, k_y] plane, so that each point corresponds to the plane area ΔA_k = (π/a)(π/b), and the number of modes in a large k-plane area A_k >> ΔA_k is N = A_k/ΔA_k = abA_k/π² = AA_k/π², where A is the waveguide’s cross-section area. 50 However, it is frequently more convenient to discuss transverse wave vectors k_t of arbitrary direction, i.e. with arbitrary signs of their components k_x and k_y. Taking into account that the opposite values of each component actually give the same wave, the actual number of different modes of each type (E or H) is a factor of 4 lower than was calculated above. This means that the number of modes of both types is

$$N = 2\frac{AA_k}{(2\pi)^2}. \tag{7.147}$$

It may be convincingly argued that this mode counting rule is valid for waveguides with cross-section of any shape, and any boundary conditions on the walls, provided that N >> 1.
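For the rectangular waveguide, the counting rule (147) may be compared with a direct enumeration of the eigenvalues (132). The sketch below is an illustration with hypothetical function names; it assumes that the k-plane area A_k is the full circle of radius k_max (i.e. all modes with k_t ≤ k_max are counted), and uses the rounded waveguide sizes a = 2.3 cm and b = 1.0 cm.

```python
import math

def count_modes(a, b, k_max):
    """Directly count H and E modes of a rectangular waveguide with k_t <= k_max."""
    n_max = int(k_max * a / math.pi)
    m_max = int(k_max * b / math.pi)
    h_modes = e_modes = 0
    for n in range(n_max + 1):
        for m in range(m_max + 1):
            if n == 0 and m == 0:
                continue
            if math.pi * math.hypot(n / a, m / b) <= k_max:
                h_modes += 1                      # Eq. (131): n, m not both zero
                if n > 0 and m > 0:
                    e_modes += 1                  # Eq. (136): n, m both nonzero
    return h_modes, e_modes

def estimate_modes(a, b, k_max):
    """Eq. (147), with A = a*b and A_k = pi*k_max**2 (a full circle in the k-plane)."""
    return 2.0 * (a * b) * (math.pi * k_max ** 2) / (2.0 * math.pi) ** 2

a, b = 0.023, 0.010
k_max = 50 * math.pi / a          # 50 times the transverse wave number of the basic mode
h_modes, e_modes = count_modes(a, b, k_max)
print(h_modes, e_modes, h_modes + e_modes, estimate_modes(a, b, k_max))
```

The H-mode count slightly exceeds and the E-mode count slightly undershoots one half of the estimate, because the points on the axes of the k-plane belong only to H modes; in the total number, these deviations largely compensate each other.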


7.8. Dielectric waveguides and optical fibers 

Now let us discuss electromagnetic wave propagation in dielectric waveguides. The simplest, step-index waveguide (Figs. 23b, 25) consists of an inner core and an outer shell (in the optical fiber technology, called cladding) with a higher wave propagation speed, i.e. lower index of refraction:

$$v_+ > v_-, \quad\text{i.e. } k_+ < k_-, \quad \varepsilon_+\mu_+ < \varepsilon_-\mu_-. \tag{7.148}$$

(In most cases the difference is achieved due to that in the dielectric constant, ε_- > ε_+, while magnetically both materials are almost passive: μ_- ≈ μ_+ ≈ μ_0, and I will assume that in my narrative.) The idea of the waveguide operation may be readily understood in the case when wavelength λ is much smaller than the characteristic size R of the core’s cross-section. In this “geometric optics” limit, at distances of the order of λ from the core-to-cladding interface, which determines the wave reflection, we can consider the interface as a plane. As we know from Sec. 5, if angle θ of plane wave incidence on such an interface is larger than the critical value θ_c specified by Eq. (82), the wave is totally reflected. As a result, the waves launched into the fiber core at such “grazing” angles propagate inside the core, being repeatedly reflected from the cladding - see Fig. 25.



Fig. 7.25. Wave propagation 
in a thick optical fiber. 


The most important type of dielectric waveguides are optical fibers. 51 Due to a heroic 
technological effort, in about three decades starting from the mid-1960s, the attenuation of glass fibers 


50 This formula ignores the fact that, according to our analysis, some modes (with n = 0 and m = 0 for H modes,
and n = 0 or m = 0 for E modes) are forbidden. However, for $N \gg 1$, the associated corrections of Eq. (91) are
negligible.

51 For a comprehensive description of this vital technology see, e.g., A. Yariv and P. Yeh, Photonics, 6th ed.,
Oxford U. Press, 2007.
has been decreased from values of the order of 20 dB/km (typical for window glass) to the
fantastically low values of about 0.2 dB/km (meaning a virtually perfect transparency of 10-km-long
fiber segments!) - see Fig. 26a. It is remarkable that this ultralow power loss may be combined with
an extremely low frequency dispersion, especially for near-infrared waves (Fig. 26b). In conjunction
with the development of inexpensive erbium-based quantum amplifiers, this breakthrough has enabled
intercontinental (undersea), broadband 52 optical cables, which are the backbone of all the modern
telecommunication infrastructure. The only bad news is that these breakthroughs were achieved for
just one kind of material (silica-based glasses) 53 within a very narrow range of their chemical
composition. As a result, the dielectric constants $\varepsilon_\pm/\varepsilon_0$ of the cladding
and core of practical optical fibers are both close to 2.2 ($n_\pm \approx 1.5$) and are very close
to each other, so that the relative difference of the refraction indices,

$$ \Delta \equiv \frac{n_- - n_+}{n_\pm} \approx \frac{\varepsilon_- - \varepsilon_+}{2\varepsilon_\pm}, \qquad (7.149) $$

is typically below 0.5%, thus limiting the fiber bandwidth - see below.
Fig. 7.26. (a) Attenuation and (b) dispersion of representative single-mode optical fibers.
(Adapted, respectively, from http://olson-technology.com and http://www.timbercon.com .) 


Practical optical fibers come in two flavors: multi-mode and single-mode ones. Multi-mode
fibers, used for transfer of high optical power (up to as much as ~10 watts), have relatively thick
cores, with a diameter $2R$ of the order of 50 μm, much larger than $\lambda \sim 1$ μm. In this
case, the “geometric-optics” picture of the wave propagation discussed above is quantitatively
correct, and we may use it to calculate the number of quasi-plane-wave modes that may propagate in
the fiber. Indeed, for the complementary angle (Fig. 25)


52 Each frequency band shown in Fig. 26a, at a typical signal-to-noise ratio $S/N > 10^5$ (50 dB), corresponds to the
Shannon bandwidth $\Delta f \log_2(S/N)$ exceeding $10^{14}$ bits per second, five orders of magnitude (!) higher than
that of a modern Ethernet cable. And this is only per one fiber; an optical cable may have hundreds of them.

53 The silica-based fibers were suggested in 1966 by C. Kao (the 2009 Nobel Prize in physics), but the very idea 
of using optical fibers for communications may be traced back to at least the 1963 work by J. Nishizawa. 
$$ \vartheta \equiv \frac{\pi}{2} - \theta, \qquad (7.150) $$

Eq. (82) gives the propagation condition

$$ \cos\vartheta > \frac{n_+}{n_-} = 1 - \Delta. \qquad (7.151) $$


For the case $\Delta \ll 1$, when the incidence angles $\theta > \theta_c$ of all propagating waves
are close to $\pi/2$, and hence the complementary angles $\vartheta$ are small, we can keep only the
first two terms in the Taylor expansion of the left-hand part of Eq. (151),
$\cos\vartheta \approx 1 - \vartheta^2/2$, and get

$$ \vartheta_{\max}^2 \approx 2\Delta. \qquad (7.152) $$

Even for the higher-end value $\Delta = 0.005$, this critical angle is only ~0.1 radian, i.e. close
to 5°. Due to this smallness, we can approximate the maximum transverse component of the wave
vector as

$$ (k_t)_{\max} \approx \sqrt{2\Delta}\,k, \qquad (7.153) $$


and use Eq. (147) to calculate the number $N$ of propagating modes:

$$ N \approx 2\,\frac{(\pi R^2)\,(\pi\, 2\Delta k^2)}{(2\pi)^2} = (kR)^2\Delta. \qquad (7.154) $$

For typical values $k = 0.73\times10^7$ m$^{-1}$ (corresponding to the free-space wavelength
$\lambda_0 = n\lambda = 2\pi n/k \approx 1.3$ μm), $R = 25$ μm, and $\Delta = 0.005$, this formula
gives $N \sim 150$.
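The estimate above is easy to reproduce numerically. The following is only a minimal sketch: the function names are illustrative (they do not come from the text), and the inputs are the typical values quoted above. With these inputs the arithmetic gives a number somewhat above 150, i.e. of the same order as the estimate in the text.

```python
import math

def multimode_count(k, core_radius, delta):
    """Mode number estimate N = (k R)^2 * Delta, Eq. (7.154); meaningful only for N >> 1."""
    return (k * core_radius) ** 2 * delta

def max_complementary_angle(delta):
    """Largest complementary angle from Eq. (7.152): theta_max^2 = 2 * Delta."""
    return math.sqrt(2.0 * delta)

k = 0.73e7       # wave number in the core, 1/m (value quoted in the text)
R = 25e-6        # core radius, m
delta = 0.005    # relative difference of the refraction indices

n_modes = multimode_count(k, R, delta)
angle = max_complementary_angle(delta)
print(f"N ~ {n_modes:.0f}, max complementary angle ~ {angle:.2f} rad")
```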


The largest problem with using multi-mode fibers for communications is their high geometric
dispersion, i.e. the difference of the mode propagation speeds, which is usually characterized in
terms of the signal delay time difference (traditionally measured in picoseconds per kilometer)
between the fastest and the slowest mode. Within the geometric optics approximation, the difference
of time delays of the fastest mode (with $k_z = k$) and the slowest mode (with
$k_z = k\sin\theta_c$) at distance $l$ is

$$ \Delta t = \Delta\!\left(\frac{l}{v_z}\right) = \Delta\!\left(\frac{k_z l}{\omega}\right) = \frac{l}{\omega}\,\Delta k_z = \frac{l}{v}\,(1 - \sin\theta_c) = \frac{l}{v}\left(1 - \frac{n_+}{n_-}\right) = \frac{l}{v}\,\Delta. \qquad (7.155) $$


For the example considered above, the TEM wave speed $v = c/n \approx 2\times10^8$ m/s, and the
geometric dispersion $\Delta t/l$ is close to 25 ps/m, i.e. 25,000 ps/km. (This means, for example,
that a 1-ns pulse, being distributed between the modes, would spread to a ~25-ns pulse after passing
just a 1-km fiber segment.) Such disastrous dispersion should be compared with chromatic dispersion
that is due to the frequency dependence of $\varepsilon_\pm$, and has the steepness
$(dt/d\lambda)/l$ of the order of 10 ps/km·nm (see the solid pink line in Fig. 26b). One can see
that through the whole frequency band ($d\lambda \sim 100$ nm) the total chromatic dispersion
$dt/l$ is of the order of only 1,000 ps/km.
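The comparison of the two dispersion mechanisms may be checked with a few lines of arithmetic. This is just a sketch under the assumptions stated in the text (n = 1.5, Δ = 0.005, a chromatic slope of about 10 ps/km·nm over a band of about 100 nm); the function names and the rounded value of c are illustrative choices.

```python
C = 3.0e8  # speed of light in free space, m/s (rounded)

def geometric_dispersion_ps_per_km(n, delta):
    """Delta t / l = Delta / v, the last form of Eq. (7.155), converted from s/m to ps/km."""
    v = C / n                       # TEM wave speed in the core
    return delta / v * 1e12 * 1e3   # s/m -> ps/m -> ps/km

def chromatic_dispersion_ps_per_km(slope_ps_per_km_nm, band_nm):
    """Total chromatic spread across the band, for a constant dispersion slope."""
    return slope_ps_per_km_nm * band_nm

geo = geometric_dispersion_ps_per_km(n=1.5, delta=0.005)
chrom = chromatic_dispersion_ps_per_km(10.0, 100.0)
print(f"geometric: {geo:.0f} ps/km, chromatic: {chrom:.0f} ps/km, ratio: {geo / chrom:.0f}")
```

The ratio of the two numbers shows directly why multi-mode operation, rather than material dispersion, limits the bandwidth of thick-core fibers.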

Due to the large geometric dispersion, the multimode fibers are used for signal transfer over only
short distances (~100 m), while long-range communications are based on single-mode fibers, with thin
cores (typically with diameters $2R \sim 5$ μm, i.e. of the order of $\lambda/\Delta^{1/2}$). For
such structures, Eq. (154) yields $N \sim 1$, but in this case the geometric optics approximation is
not quantitatively valid, and we should get back to the Maxwell equations. In particular, this
analysis should take into explicit account the evanescent wave propagating in the cladding, because
its penetration depth may be comparable with $R$. 54

Since the cross-section of an optical fiber is not uniform and lacks metallic conductors, the
Maxwell equations cannot be exactly satisfied with either a TEM, or a TE, or a TM solution. Instead,
the fibers can carry so-called HE and EH modes, with both fields having longitudinal components
simultaneously. In such modes, both $E_z$ and $H_z$ inside the core ($\rho < R$) have a form similar
to Eq. (141):

$$ f_- = f_l J_n(k_t\rho)\,e^{in\varphi}, \quad \text{with } k_t^2 = k_-^2 - k_z^2 > 0, \quad k_-^2 \equiv \omega^2\varepsilon_-\mu_-, \qquad (7.156) $$

where amplitudes $f_l$ (i.e., $E_l$ and $H_l$) may be complex to account for the possible angular
shift between these components. On the other hand, for the evanescent wave in the cladding, we may
rewrite Eq. (102) as

$$ (\nabla_2^2 - \kappa_t^2) f_+ = 0, \quad \text{with } \kappa_t^2 \equiv k_z^2 - k_+^2 > 0, \quad k_+^2 \equiv \omega^2\varepsilon_+\mu_+. \qquad (7.157) $$


Figure 27 illustrates the relation between $k_t$, $\kappa_t$, $k_z$, and $k_\pm$; note that the
following sum,

$$ k_t^2 + \kappa_t^2 = \omega^2(\varepsilon_- - \varepsilon_+)\mu_0, \qquad (7.158) $$

is fixed (at fixed frequency) and, for typical fibers, very small ($\approx 2\Delta k^2 \ll k^2$).
By the way, Fig. 27 shows that neither $k_t$ nor $\kappa_t$ can be larger than
$\omega[(\varepsilon_- - \varepsilon_+)\mu_0]^{1/2} \approx (2\Delta)^{1/2}k$. In particular, this
means that the depth $\delta = 1/\kappa_t$ of wave penetration into the cladding is at least
$1/[k(2\Delta)^{1/2}] = \lambda/[2\pi(2\Delta)^{1/2}] \gg \lambda/2\pi$. This is why the cladding
layers in practical optical fibers are made as thick as ~50 μm, so that only a negligibly small
tail of this evanescent wave field reaches their outer surfaces.


Fig. 7.27. Relation between the transverse exponents $k_t$ and $\kappa_t$ for waves in optical
fibers.


In the polar coordinates, Eq. (157) becomes

$$ \left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2} - \kappa_t^2\right] f_+ = 0, \qquad (7.159) $$

instead of Eq. (139). From Sec. 2.5 we know that the eigenfunctions of Eq. (159) are the products of
the angular factor $\exp\{in\varphi\}$ by a linear combination of the modified Bessel functions
$I_n$ and $K_n$, shown in


54 I believe that the following calculation is important - both for practice, and as a good example of Maxwell
theory application. However, its results will not be used in the following sections/chapters of the course, so that if
the reader is not interested in this topic, he or she may safely jump to the beginning of Sec. 9.


Fig. 2.20, now of argument $\kappa_t\rho$. In our case, the fields should vanish at
$\rho \to \infty$, so that only the latter functions (of the second kind) can participate:

$$ f_+ \propto K_n(\kappa_t\rho)\,e^{in\varphi}. \qquad (7.160) $$


Now we have to reconcile Eqs. (156) and (160), using the boundary conditions at $\rho = R$ for both
longitudinal and transverse components of both fields, with the latter fields first calculated using
Eqs. (121). Such a conceptually simple, but a bit bulky calculation (which I am leaving for reader’s
exercise :-), yields a system of two linear, homogeneous equations for complex amplitudes $E_l$ and
$H_l$, that are compatible if

$$ \left(\frac{k_-^2}{k_t}\frac{J_n'}{J_n} + \frac{k_+^2}{\kappa_t}\frac{K_n'}{K_n}\right)\left(\frac{1}{k_t}\frac{J_n'}{J_n} + \frac{1}{\kappa_t}\frac{K_n'}{K_n}\right) = \frac{n^2}{R^2}\left(\frac{k_-^2}{k_t^2} + \frac{k_+^2}{\kappa_t^2}\right)\left(\frac{1}{k_t^2} + \frac{1}{\kappa_t^2}\right), \qquad (7.161) $$

where prime means the derivative of each function over its full argument: $k_t\rho$ for $J_n$, and
$\kappa_t\rho$ for $K_n$.

For any given frequency $\omega$, the system of Eqs. (158) and (161) determines the values of $k_t$
and $\kappa_t$, and hence $k_z$. Actually, for any $n > 0$, this system provides two different
solutions: one corresponding to the so-called HE wave, with a larger ratio $E_z/H_z$, and the other
to the EH wave, with a smaller value of that ratio. For angular-symmetric modes with $n = 0$ (for
which we might naively expect the lowest cutoff frequency), the equations may be satisfied by fields
having just one finite longitudinal component (either $E_z$ or $H_z$), and the HE modes are the
usual E waves, while the EH modes are the H waves. For the H modes, the characteristic equation is
reduced to the requirement that the second parentheses in the left-hand part of Eq. (161) equal
zero. Using the identities $J_0' = -J_1$ and $K_0' = -K_1$, this equation may be rewritten as

$$ \frac{1}{k_t}\frac{J_1(k_tR)}{J_0(k_tR)} = -\frac{1}{\kappa_t}\frac{K_1(\kappa_tR)}{K_0(\kappa_tR)}. \qquad (7.162) $$

Using the simple relation between $k_t$ and $\kappa_t$ given by Eq. (158), we may plot both parts
of Eq. (162) as a function of the same argument, say, $\xi \equiv k_tR$ - see Fig. 28.



Fig. 7.28. Two sides of the characteristic equation (162), plotted as a function of $k_tR$, for two
values of its dimensionless parameter: V = 8 (blue line) and V = 3 (red line). Note that according
to Eq. (158), the argument of functions $K_0$ and $K_1$ is just
$\kappa_tR = [V^2 - (k_tR)^2]^{1/2} \equiv (V^2 - \xi^2)^{1/2}$.


The right-hand part of Eq. (162) depends not only on $\xi$ but also on the dimensionless parameter
$V$ defined as the normalized right-hand part of Eq. (158):

$$ V^2 \equiv \omega^2(\varepsilon_- - \varepsilon_+)\mu_0R^2 \approx 2\Delta k_\pm^2R^2. \qquad (7.163) $$

(According to Eq. (154), if $V \gg 1$, $V^2$ gives the doubled number of the fiber modes - the
conclusion confirmed by Fig. 28, taking into account that it describes only the H modes.) Since the
ratio $K_1/K_0$ is positive for all values of their argument (see, e.g., the right panel of Fig.
2.20), the right-hand part of Eq. (162) is always negative, so that the equation may have solutions
only in the intervals where the ratio $J_1/J_0$ is negative, i.e. at

$$ \xi_{01} < \xi < \xi_{11}, \quad \xi_{02} < \xi < \xi_{12}, \ldots, \qquad (7.164) $$

where $\xi_{nm}$ is the m-th zero of function $J_n(\xi)$ - see Table 2.1. The right-hand part of the
characteristic equation diverges at $\kappa_tR \to 0$, i.e. at $k_tR \to V$, so that no solutions
are possible if $V$ is below the critical value $V_c = \xi_{01} \approx 2.405$. At this cutoff
point, Eq. (163) yields $k_\pm \approx \xi_{01}/[R(2\Delta)^{1/2}]$. Hence, the cutoff frequency for
the lowest H mode corresponds to the TEM wavelength

$$ \lambda_{\max} = \frac{2\pi R}{\xi_{01}}(2\Delta)^{1/2} \approx 3.7\,R\,\Delta^{1/2}. \qquad (7.165) $$

For typical parameters $\Delta = 0.005$ and $R = 2.5$ μm, this result yields
$\lambda_{\max} \sim 0.65$ μm, corresponding to the free-space wavelength $\lambda_0 \sim 1$ μm. A
similar analysis of the first parentheses in the left-hand part of Eq. (161) shows that at
$\Delta \to 0$, the cutoff frequency for the E modes is similar.
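The cutoff estimate may be reproduced without any special-function library. The sketch below is an illustration only: it finds the first zero of $J_0$ by bisection on its power series (a choice made here for self-containment, not a method taken from the text) and then evaluates the cutoff wavelength $\lambda_{\max} = (2\pi R/\xi_{01})(2\Delta)^{1/2}$ for the typical parameters quoted above. All function names are hypothetical.

```python
import math

def bessel_j0(x, terms=30):
    """Power series J_0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2; adequate for moderate x."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def first_zero_j0(lo=2.0, hi=3.0, tol=1e-12):
    """Bisection for the first zero of J_0, bracketed by J_0(2) > 0 > J_0(3)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j0(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cutoff_wavelength(core_radius, delta):
    """lambda_max = (2 pi R / xi_01) * (2 Delta)^(1/2), the cutoff of the lowest n = 0 mode."""
    return 2.0 * math.pi * core_radius / first_zero_j0() * math.sqrt(2.0 * delta)

xi_01 = first_zero_j0()
lam_max = cutoff_wavelength(core_radius=2.5e-6, delta=0.005)
print(f"xi_01 = {xi_01:.4f}, lambda_max = {lam_max * 1e6:.2f} um")
```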

This situation may look exactly like that in metallic waveguides, with no waves possible at
frequencies below $\omega_c$, but this is not so. The basic reason for the difference is that in
metallic waveguides, the approach to $\omega_c$ results in the divergence of the longitudinal
wavelength $\lambda_z \equiv 2\pi/k_z$. On the contrary, in dielectric waveguides this approach
leaves $\lambda_z$ finite ($k_z \to k_+$). Due to this difference, a certain linear superposition of
HE and EH modes with $n = 1$ can propagate at frequencies well below the cutoff frequency for
$n = 0$, which we have just calculated. 55 This mode, in the limit
$\varepsilon_+ \approx \varepsilon_-$ (i.e. $\Delta \ll 1$) allows a very interesting and simple
description using the Cartesian (rather than polar) components of the fields, but still expressed as
functions of polar coordinates $\rho$ and $\varphi$. The reason is that this mode is very close to a
linearly polarized TEM wave. (For this reason, this mode is referred to as $LP_{01}$.)


Let us select axis x parallel to the transverse component of the magnetic field vector, so that
$E_x|_{\rho=0} = 0$, but $E_y|_{\rho=0} \neq 0$, and $H_x|_{\rho=0} \neq 0$, but
$H_y|_{\rho=0} = 0$. The only suitable solutions of the 2D Helmholtz equation (that should be obeyed
not only by the z-components of the field, but also by their x- and y-components) are proportional
to $J_0(k_t\rho)$, with zero coefficients for $E_x$ and $H_y$:

$$ E_x = 0, \quad E_y = E_0J_0(k_t\rho), \quad H_x = H_0J_0(k_t\rho), \quad H_y = 0, \quad \text{for } \rho < R. \qquad (7.166) $$


Now we can readily calculate the longitudinal components, using the last two equations of Eqs. (100):

$$ E_z = \frac{1}{-ik_z}\frac{\partial E_y}{\partial y} = -i\frac{k_t}{k_z}E_0J_1(k_t\rho)\sin\varphi, \quad H_z = \frac{1}{-ik_z}\frac{\partial H_x}{\partial x} = -i\frac{k_t}{k_z}H_0J_1(k_t\rho)\cos\varphi, \qquad (7.167) $$

where I have used mathematical identities $J_0' = -J_1$,
$\partial\rho/\partial x = x/\rho = \cos\varphi$, and
$\partial\rho/\partial y = y/\rho = \sin\varphi$. As a sanity check, we see that the longitudinal
component of each field is a (legitimate!) eigenfunction of the


55 This fact becomes less surprising if we recall that in the circular metallic waveguide, discussed in Sec. 7, the
lowest mode ($H_{11}$, Fig. 23) also corresponded to n = 1 rather than n = 0.


type (141) with n = 1. Note also that if $k_t \ll k_z$ (this relation is always true if
$\Delta \ll 1$ - see Fig. 27), the longitudinal components of the fields are much smaller than their
transverse counterparts, so that the wave is indeed very close to the TEM one. Because of that, the
ratio of the electric and magnetic field amplitudes is also close to that in the TEM wave:
$E_0/H_0 \approx Z_- \approx Z_+$.

Now in order to ensure the continuity of the fields at the core-to-cladding interface ($\rho = R$),
we need to have a similar angular dependence of these components at $\rho > R$. The longitudinal
components of the fields are tangential to the interface and thus should be continuous. Using the
solutions similar to Eq. (160) with n = 1, we get

$$ E_z = -i\frac{k_t}{k_z}\frac{J_1(k_tR)}{K_1(\kappa_tR)}E_0K_1(\kappa_t\rho)\sin\varphi, \quad H_z = -i\frac{k_t}{k_z}\frac{J_1(k_tR)}{K_1(\kappa_tR)}H_0K_1(\kappa_t\rho)\cos\varphi, \quad \text{for } \rho > R. \qquad (7.168) $$

For the transverse components, we should require the continuity of the normal magnetic field
$\mu H_n$, for our simple field structure equal to just $\mu H_x\cos\varphi$, of the tangential
electric field $E_\tau = E_y\cos\varphi$, and of the normal component
$D_n = \varepsilon E_n = \varepsilon E_y\sin\varphi$. Assuming that $\mu_- = \mu_+ = \mu_0$, and
$\varepsilon_+ \approx \varepsilon_-$, 56 we can satisfy these conditions with the following
solutions:

$$ E_x = 0, \quad E_y = \frac{J_0(k_tR)}{K_0(\kappa_tR)}E_0K_0(\kappa_t\rho), \quad H_x = \frac{J_0(k_tR)}{K_0(\kappa_tR)}H_0K_0(\kappa_t\rho), \quad H_y = 0, \quad \text{for } \rho > R. \qquad (7.169) $$


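From here, we can calculate the components $E_z$ and $H_z$, using the same approach as for
$\rho < R$:

$$ E_z = \frac{1}{-ik_z}\frac{\partial E_y}{\partial y} = -i\frac{\kappa_t}{k_z}\frac{J_0(k_tR)}{K_0(\kappa_tR)}E_0K_1(\kappa_t\rho)\sin\varphi, $$
$$ H_z = \frac{1}{-ik_z}\frac{\partial H_x}{\partial x} = -i\frac{\kappa_t}{k_z}\frac{J_0(k_tR)}{K_0(\kappa_tR)}H_0K_1(\kappa_t\rho)\cos\varphi, \quad \text{for } \rho > R. \qquad (7.170) $$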


We see that this equation provides the same functional dependence of the fields as Eqs. (168), i.e.
the internal and external fields are compatible, but their amplitudes coincide only if

$$ k_t\frac{J_1(k_tR)}{J_0(k_tR)} = \kappa_t\frac{K_1(\kappa_tR)}{K_0(\kappa_tR)}. \qquad (7.171) $$

This characteristic equation (which may be also derived from Eq. (161) with n = 1 in the limit
$\Delta \to 0$) looks close to Eq. (162), but functionally is much different from it - see Fig. 29.
Indeed, its right-hand part is always positive, and the left-hand part tends to zero at
$k_tR \to 0$. Due to this, Eq. (171) may have a solution for arbitrary small values of parameter
$V$, defined by Eq. (163), i.e. for arbitrary low frequencies. This is why this mode is used in
practical single-mode fibers: there are no other modes that can propagate at $\omega < \omega_c$, so
that the geometric dispersion problem is avoided.

It is easy to use the Bessel function approximations given by the first term of the expansion
(2.132) and also Eq. (2.157) to show that in the limit $V \to 0$ (i.e. $V \ll 1$), $\kappa_tR$
tends to zero much faster


56 This is the core assumption of this approximate theory, which accounts only for the most important effect of the
difference of dielectric constants $\varepsilon_+$ and $\varepsilon_-$: the opposite signs of the differences
$(k_-^2 - k_z^2) = k_t^2$ and $(k_+^2 - k_z^2) = -\kappa_t^2$. For more discussion of the accuracy of this
approximation and some exact results, let me refer the interested reader either to the monograph by A. Snyder and
D. Love, Optical Waveguide Theory, Chapman and Hall, 1983, or to Chapter 3 and Appendix B in the monograph by
Yariv and Yeh, which was cited above.
than $k_tR \approx V$: $\kappa_tR \to 2\exp\{-2/V^2\} \ll V$. This means that the scale
$\rho_c \equiv 1/\kappa_t$ of the radial distribution of the $LP_{01}$ wave’s fields in the cladding
becomes very large. In this limit, this mode may be interpreted as a virtually TEM wave propagating
in the cladding, just slightly deformed (and guided) by the fiber core. The drawback of this feature
is that it requires very thick cladding, in order to avoid energy losses in outer (“buffer” and
“jacket”) layers that defend the silica components from the elements, but lack their low optical
absorption. For this reason, the core radius is usually selected so that parameter $V$ is just
slightly less than the critical value $V_c = \xi_{01} \approx 2.4$ for higher modes, thus ensuring
the single-mode operation and eliminating the geometric dispersion problem.
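The behavior of the $LP_{01}$ mode may be explored numerically. The following is a minimal sketch, not a production solver: it assumes the characteristic equation in the dimensionless form $uJ_1(u)/J_0(u) = wK_1(w)/K_0(w)$, with $u \equiv k_tR$, $w \equiv \kappa_tR$ and $u^2 + w^2 = V^2$, evaluates $J_n$ by its power series and $K_n$ by the standard integral representation $K_n(x) = \int_0^\infty e^{-x\cosh t}\cosh nt\,dt$, and finds the root by bisection. The function names, the truncation rule for the integral, and the sample value V = 2 are illustrative assumptions.

```python
import math

def bessel_j(n, x, terms=30):
    """Power series for J_n(x); adequate for the arguments used here (0 < x < 2.4)."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def bessel_k(n, x, steps=2000):
    """K_n(x) as the integral of exp(-x cosh t) cosh(n t) over t >= 0 (trapezoid rule)."""
    t_max = math.acosh(1.0 + 50.0 / x)   # the integrand is negligible beyond this point
    h = t_max / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)) * math.cosh(n * t_max))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(n * t)
    return total * h

def lp01_residual(u, v):
    """Left-hand minus right-hand part: u J1(u)/J0(u) - w K1(w)/K0(w), w = (V^2 - u^2)^(1/2)."""
    w = math.sqrt(v * v - u * u)
    return u * bessel_j(1, u) / bessel_j(0, u) - w * bessel_k(1, w) / bessel_k(0, w)

def solve_lp01(v, iterations=60):
    """Bisection for u = k_t R on the interval (0, min(V, xi_01)); assumes V > 0."""
    lo = 1e-6
    hi = min(v, 2.4048) * (1.0 - 1e-9)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if lp01_residual(mid, v) < 0.0:   # the residual grows monotonically with u
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return u, math.sqrt(v * v - u * u)

u, w = solve_lp01(2.0)
print(f"V = 2.0: k_t R = {u:.3f}, kappa_t R = {w:.3f}")
```

Calling `solve_lp01` with progressively smaller V shows how quickly $\kappa_tR$ collapses, i.e. how far the field spreads into the cladding, which is the practical reason for operating just below $V_c$.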



Fig. 7.29. Two sides of the characteristic equation (171) for the $LP_{01}$ mode, plotted as a
function of $k_tR$, for two values of the dimensionless parameter: V = 8 (blue line) and V = 1 (red
line).


In order to reduce the field spread into the cladding, the step-index fibers considered above may
be replaced with graded-index fibers whose dielectric constant $\varepsilon_r$ is gradually and
slowly decreased from the center to the periphery. Keeping only the main two terms in the Taylor
expansion of the function $\varepsilon(\rho)$ at $\rho = 0$, we may approximate such reduction as


e(p)«e(0) I- 7 P 2 


(7.172) 


2 2 

where = - \(d sldp)ls\p = 0 is a positive constant characterizing the fiber composition gradient. 57 
Moreover, if this constant is sufficiently small « k ), the field distribution across the fiber’s cross- 
section may be described by the same 2D Helmholtz equation, but with the space-dependent transverse 
wave vector: 58 


[ v i +k t(p)\f = 0, where kf(p) = k\p) -k] = co 2 s(p)p 0 -k] = Arf (0)f 1 -^p 1 '] . (7.173) 

V 2 


Surprisingly for such an axially-symmetric problem, because of its special dependence on the radius,
this equation may be most readily solved in Cartesian coordinates. Indeed, rewriting it as


57 For an axially-symmetric fiber with a smooth function $\varepsilon(\rho)$, the first derivative
$d\varepsilon/d\rho$ should vanish at $\rho = 0$.

58 Such approach is invalid at arbitrary (large) $\zeta$. Indeed, in the macroscopic Maxwell equations,
$\varepsilon(\mathbf{r})$ is under the differentiation sign, and the exact Helmholtz-type equations for fields have
additional terms containing $\nabla\varepsilon$.
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a 


^2 C 2 


X- + Y— + km l-^x 2 -^v 

Sx 2 fy 2 ' \ 2 2 • 


and separating variables as f=X(x)Y(y), we get 

d 2 X d 2 Y " 


■ + ■ 


f ,2 <T .2 


+ G (0) l-4-x -y-y =0, 


dX 1 dY 2 2 2' 

so that functions X and Y obey the same similar differential equation. 


d 2 f 

dx 2 


+ k v 


l-^x 2 


/ = 0, / = 


with the separation constants satisfying the following relation: 


k 2 + k 2 = k 2 (0) = <d 2 £(0)/j (i - k 2 . 


/ = 0, 

(7.174) 

\ 

2 =0, 

J 

n, 

(7.175) 

[x, 

{?■ 

(7.176) 


(7.177) 


Equation (176) is well known from elementary quantum mechanics, because the Schrödinger
equation for perhaps the most important quantum system, a 1D harmonic oscillator, may be rewritten
in this form. Its eigenvalues are described by a simple formula


(*A = 


r g \ V2 

V2 j 


( 2/1 + 1 ), 


(*,)„ = 


f g \ V2 

V2 j 


(2/n + l), n,m = 0,1,2,... 


(7.178) 


but eigenfunctions $X_n(x)$ and $Y_m(y)$ have to be expressed via not quite elementary functions -
the Hermite polynomials. 59 For our purposes, however, the lowest eigenfunctions $X_0(x)$ and
$Y_0(y)$ are sufficient, because they correspond to the lowest $k_{x,y}$ and hence the lowest cutoff
frequency:


= (k 2 x ) 0 + (K)o = S ■ 


(7.179) 


(Note that at $\zeta \to 0$, the cutoff frequency tends to zero, as it should be for a wave in a
uniform medium.) The eigenfunctions corresponding to the lowest eigenvalues are simple:


/ 0 (x) = const x exp< - 




(7.180) 


so that the field distribution follows the Gaussian (“bell curve”) function 


/o(/ 7 )=/o(°) ex P1- 


f(x +y ) 


= /o(0)exp(- 




(7.181) 


This is the so-called Gaussian beam, very convenient for some applications. Still, the graded-index 
fibers have higher attenuation than their step-index counterparts, and are not used as broadly. 

Speaking of the Gaussian beams (or more generally, any beams with an axially-symmetric profile
$f_0(\rho)$), I cannot help noticing the very curious option of forming so-called helical waves with
complex amplitude $f_0(\rho)\exp\{il\varphi\}$, where $l$ is an integer constant, and $\varphi$ is
the azimuthal angle (so that in our notation $x = \rho\cos\varphi$, $y = \rho\sin\varphi$). Let me
leave it for reader’s exercise to prove that the electromagnetic


59 See, e.g., QM Sec. 2.6. 
field of such a wave has an angular momentum vector $\mathbf{L} = L_z\mathbf{n}_z$, with $L_z$
proportional to $l$. 60 Quantization of the helical field gives $L_z = l\hbar$ per photon. The case
$l = \pm1$ is possible for infinite-width beams (i.e. plane waves) and means their circular
polarization, quantum-mechanically corresponding to spin ±1 of their photons - see the discussion in
the end of Sec. 1. In contrast, the implementation of higher values of $|l|$ requires space-limited
beams (with $f_0 \to 0$ at $\rho \to \infty$) and may be interpreted as giving the wave an
additional “orbital” angular momentum. 61


7.9. Resonators 

Resonators are the distributed oscillators, i.e. structures that may sustain standing waves (in 
electrodynamics, oscillations of the electric and magnetic field at each point) even without a source, 
until the oscillation amplitude slowly decreases in time due to unavoidable energy losses. If the 
resonator quality (described by the so-called Q-factor, which will be defined and discussed in the next 
section) is high, this decay takes many oscillation periods. Alternatively, high-Q resonators may
sustain oscillating fields permanently, if fed with a relatively weak incident wave.

Conceptually the simplest resonator is the Fabry-Perot interferometer 62 that may be obtained by
placing two well-conducting planes parallel to each other. 63 Indeed, in Sec. 1 we have seen that if
a plane wave is normally incident on such a “perfect mirror”, located at z = 0, its reflection, at
negligible skin depth, results in a standing wave described by Eq. (61) - that may be rewritten as

$$ E(z,t) = \mathrm{Re}\left(2E_\omega e^{-i\omega t + i\pi/2}\right)\sin kz. \qquad (7.182) $$

Hence the wave would not change if we had suddenly put the second mirror (isolating the segment of
length $l$ from the external wave source) at any position $z = l$ with $\sin kl = 0$, i.e.

$$ kl = p\pi, \quad \text{where } p = 1, 2, \ldots \qquad (7.183) $$

This condition, which also determines the eigen- (or resonance) frequency spectrum of the resonator
of fixed length $l$,

$$ \omega_p = vk_p = \frac{\pi v}{l}\,p, \qquad (7.184) $$


60 This task should be easier after reviewing results of field’s momentum analysis in Sec. 9.8, in particular Eqs. 
(9.235) and (9.237). 

61 Theoretically, the possibility of separating the angular momentum of an electromagnetic wave into the “spin”
and “orbital” parts may be traced back to at least the 1943 work by J. Humblet; however, this issue had not been
discussed in literature too much until the spectacular 1992 experiments by L. Allen et al. who demonstrated a
simple way of generating such helical optical beams. (For reviews of this and later work see, e.g., G.
Molina-Terriza et al., Nature Physics 3, 305 (2007) and/or L. Marrucci et al., J. Opt. 13, 064001 (2011), and
references therein.) Presently there are efforts to use this approach for so-called “orbital angular momentum (OAM)
multiplexing” of waves for high-rate information transmission - see, e.g., J. Wang et al., Nature Photonics 6, 488
(2012).

62 The device is named after its inventors, M. Fabry and A. Perot; and is also called the Fabry-Perot etalon
(meaning “gauge”), because of its initial usage for the light wavelength measurement.

63 The resonators formed by well conducting (usually, metallic) walls are frequently called the resonant cavities. 
has a simple physical sense: the resonator length $l$ equals exactly $p$ half-waves of frequency
$\omega_p$. Though this is all very simple, please note a considerable change of philosophy from
what we have been doing in the previous sections: the main task in resonator analysis is finding its
eigenfrequencies $\omega_p$ that are now determined by the system geometry rather than by an
external wave source.
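The resonance condition may be illustrated by listing a few eigenfrequencies. This is a minimal sketch that assumes the spectrum in the form $k_p = p\pi/l$, $\omega_p = vk_p$; the function name and the sample resonator (a 1.5-cm gap filled with vacuum) are hypothetical choices made only for the example.

```python
import math

def fabry_perot_modes(length, wave_speed, p_max):
    """Return a list of (p, k_p, omega_p), with k_p = p*pi/l and omega_p = v*k_p."""
    modes = []
    for p in range(1, p_max + 1):
        k_p = p * math.pi / length
        modes.append((p, k_p, wave_speed * k_p))
    return modes

# Hypothetical example: a 1.5-cm vacuum gap between two ideal mirrors.
gap = 0.015
modes = fabry_perot_modes(length=gap, wave_speed=3.0e8, p_max=3)
for p, k_p, omega_p in modes:
    half_waves = gap / (math.pi / k_p)   # resonator length in units of half-wavelengths
    print(f"p = {p}: f = {omega_p / (2.0 * math.pi) / 1e9:.1f} GHz, half-waves = {half_waves:.0f}")
```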

Before we move to more complex resonators, let us use Eq. (62) to present the magnetic field in
the Fabry-Perot interferometer:

$$ H(z,t) = \mathrm{Re}\left(2\frac{E_\omega}{Z}e^{-i\omega t}\right)\cos kz. \qquad (7.185) $$

Expressions (182) and (185) show that in contrast to traveling waves, each field of the standing
wave changes simultaneously (proportionately) at all points of the Fabry-Perot resonator, turning to
zero everywhere twice a period. At those instants the electric field energy of the resonator
vanishes, but the total energy stays constant, because the magnetic field oscillates (also
simultaneously at all points) with the phase shift $\pi/2$. Such behavior is typical for all
electromagnetic resonators.

Another, more technical remark is that we can readily get the same results (182)-(185) by solving
the Maxwell equations from scratch. For example, we already know that in the absence of dispersion,
losses, and sources, they are reduced to wave equations (3) for any field components. For the
Fabry-Perot resonator’s analysis, we can use their 1D form, say, for the transverse component of the
electric field:

$$ \left(\frac{\partial^2}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)E = 0, \qquad (7.186) $$

and solve it as a part of an eigenvalue problem with the corresponding boundary conditions. Indeed,
separating time and space variables as

$$ E(z,t) = Z(z)T(t), \qquad (7.187) $$

we get

$$ \frac{1}{Z}\frac{d^2Z}{dz^2} - \frac{1}{v^2}\frac{1}{T}\frac{d^2T}{dt^2} = 0. \qquad (7.188) $$

Calling the separation constant $k^2$, we get two similar ordinary differential equations,

$$ \frac{d^2Z}{dz^2} + k^2Z = 0, \quad \frac{d^2T}{dt^2} + k^2v^2T = 0, \qquad (7.189) $$

both with sinusoidal solutions, so that their product is a standing wave with a wave vector $k$ and
frequency $\omega = kv$, which may be presented by Eq. (182). 64 Now using the boundary conditions
$E(0,t) = E(l,t) = 0$, 65 we get the eigenvalue spectrum for $k_p$ and hence for
$\omega_p = vk_p$, given by Eqs. (183) and (184).
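The separated solution is simple enough to verify by brute force. The sketch below, in normalized units (l = 1, v = 1, an assumption made purely for convenience), takes the standing wave $\sin(k_pz)\cos(\omega_pt)$ with $k_p = p\pi/l$ and $\omega_p = vk_p$, and checks by central finite differences that it satisfies the 1D wave equation and vanishes on both mirrors. The function names and the sample point are illustrative.

```python
import math

def standing_wave(z, t, p, length, v):
    """One eigenmode: sin(k_p z) cos(omega_p t), with k_p = p*pi/l and omega_p = v*k_p."""
    k = p * math.pi / length
    return math.sin(k * z) * math.cos(v * k * t)

def wave_equation_residual(z, t, p, length, v, h=1e-4):
    """Central-difference estimate of d2E/dz2 - (1/v^2) d2E/dt2 for the mode above."""
    def f(zz, tt):
        return standing_wave(zz, tt, p, length, v)
    d2z = (f(z + h, t) - 2.0 * f(z, t) + f(z - h, t)) / h ** 2
    ht = h / v                                   # time step matched to the space step
    d2t = (f(z, t + ht) - 2.0 * f(z, t) + f(z, t - ht)) / ht ** 2
    return d2z - d2t / v ** 2

residual = wave_equation_residual(z=0.3, t=0.7, p=2, length=1.0, v=1.0)
edge_values = (standing_wave(0.0, 0.7, 2, 1.0, 1.0), standing_wave(1.0, 0.7, 2, 1.0, 1.0))
print(f"residual = {residual:.1e}, field on the mirrors = {edge_values}")
```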


64 In this form, the equations are valid even in the presence of dispersion, but with the frequency-dependent wave
speed: $v^2 = 1/\varepsilon(\omega)\mu(\omega)$.

65 This is of course the expression of the first of the general boundary conditions (104). The second of these
conditions (for the magnetic field) is satisfied automatically for the transverse waves we are considering.
Lessons from this simple case study may be readily generalized for an arbitrary resonator: there
are (at least :-) two methods of finding the eigenfrequency spectrum:

(i) We may look at a traveling wave solution and find where reflecting mirrors may be inserted
without affecting the wave’s structure. Unfortunately, this method is limited to simple geometries.

(ii) We may solve the general 3D wave equation,

$$ \left(\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)f(\mathbf{r},t) = 0, \qquad (7.190) $$

for field components, as an eigenvalue problem with appropriate boundary conditions. If system
parameters (and hence coefficient $v$) do not change in time, the spatial and temporal variables of
Eq. (190) may be always separated by taking

$$ f(\mathbf{r},t) = \mathcal{R}(\mathbf{r})\,T(t), \qquad (7.191) $$

where function $T(t)$ always obeys the same equation (189), having the sinusoidal solution of
frequency $\omega = vk$. Plugging this solution back into Eq. (190), for the spatial distribution of
the field we get the 3D Helmholtz equation,

$$ (\nabla^2 + k^2)\,\mathcal{R}(\mathbf{r}) = 0, \qquad (7.192) $$

whose solution (for non-symmetric geometries) may be much more complex.


Let us use these methods to find the eigenfrequency spectrum of a few simple, but practically
important resonators. First of all, the first method is completely sufficient for the analysis of
any resonator formed as a fragment of a uniform TEM transmission line (e.g., a coaxial cable)
between two conducting lids perpendicular to the line direction. Indeed, since in such lines
$k_z = k = \omega/v$, and the electric field is perpendicular to the propagation axis, i.e. parallel
to the lid surface, the boundary conditions are exactly the same as in the Fabry-Perot resonator,
and we again arrive at the eigenfrequency spectrum (184).


Now let us analyze a slightly more complex system: a rectangular metallic-wall cavity of volume
$a \times b \times l$ - see Fig. 30. In order to use the first method, let us consider the resonator
as a finite-length ($\Delta z = l$) section of the rectangular waveguide stretched along axis z,
which was analyzed in detail in Sec. 7. As a reminder, for $a > b$, in the basic $H_{10}$ traveling
wave mode, both E and H do not depend on y, with vector E having only a y-component. On the
contrary, vector H has both components $H_x$ and $H_z$, with the phase shift $\pi/2$ between them,
with component $H_x$ having the same phase as $E_y$ - see Eqs. (131), (137), and (138). Hence, if a
plane, perpendicular to axis z, is placed so that the electric field vanishes on it, $H_x$ also
vanishes, so that all the boundary conditions (104) pertinent to a perfect metallic wall are
fulfilled simultaneously.



Fig. 7.30. Rectangular metallic resonator as a 
finite section of a waveguide with the cross- 
section shown in Fig. 25. 


As a result, the $H_{10}$ wave would not be perturbed by two metallic walls separated by an integer
number of half-wavelengths $\lambda_z/2$ corresponding to the wave number given by Eqs. (102) and
(133):

$$ k_z = (k^2 - k_t^2)^{1/2} = \left(\frac{\omega^2}{v^2} - \frac{\pi^2}{a^2}\right)^{1/2}. \qquad (7.193) $$


Using this expression, we see that the smallest of these distances, $l = \lambda_z/2 \equiv \pi/k_z$,
gives resonance frequency 66

$$ \omega_{101} = v\left[\left(\frac{\pi}{a}\right)^2 + \left(\frac{\pi}{l}\right)^2\right]^{1/2}, \qquad (7.194) $$

with the indices showing the number of half-waves along each dimension of the system. This is the
lowest (fundamental) eigenfrequency of the resonator (if $b < a, l$).
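The fundamental frequency is easy to evaluate for a given cavity. The sketch below assumes the formula in the form $\omega_{101} = v[(\pi/a)^2 + (\pi/l)^2]^{1/2}$ and cross-checks it against the waveguide relation $k_z = (\omega^2/v^2 - \pi^2/a^2)^{1/2}$, which should return exactly one half-wave along the cavity length. The function names and the sample dimensions (an air-filled 2 cm × 3 cm cavity) are hypothetical.

```python
import math

def cavity_omega_101(a, l, wave_speed=3.0e8):
    """Fundamental eigenfrequency v * [(pi/a)^2 + (pi/l)^2]^(1/2) of the rectangular cavity."""
    return wave_speed * math.hypot(math.pi / a, math.pi / l)

def waveguide_kz(omega, a, wave_speed=3.0e8):
    """Longitudinal wave number of the basic waveguide mode: (omega^2/v^2 - pi^2/a^2)^(1/2)."""
    return math.sqrt((omega / wave_speed) ** 2 - (math.pi / a) ** 2)

a, l = 0.02, 0.03                      # hypothetical cavity sides, m (b is assumed smaller)
omega_101 = cavity_omega_101(a, l)
half_waves = waveguide_kz(omega_101, a) * l / math.pi
print(f"f_101 = {omega_101 / (2.0 * math.pi) / 1e9:.2f} GHz, half-waves along z = {half_waves:.3f}")
```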


The field distribution in this mode is close to that in the corresponding waveguide mode $H_{10}$
(Fig. 22), with the important difference that phases of the magnetic and electric fields are shifted
by $\pi/2$ both in space and time, just as in the Fabry-Perot resonator - see Eqs. (182) and (185).
Such a time shift allows for a very simple interpretation of the $H_{101}$ mode that is especially
adequate for very flat resonators, with $b \ll a, l$. At the instant when the electric field reaches
its maximum (Fig. 31a), i.e. the magnetic field vanishes in the whole volume, the surface electric
charge of the walls (with density $\sigma = \varepsilon E_n$) is largest, being localized mostly in
the middle of the broadest (in Fig. 31, horizontal) faces of the resonator. At later times, the
walls start to recharge via surface currents whose density J is largest in the side walls, and
reaches its maximal value in a quarter of the oscillation period of frequency $\omega_{101}$ - see
Fig. 31b. The currents generate the vortex magnetic field, with looped field lines in the plane of
the broadest face. The surface currents continue to flow in this direction until (in one more
quarter period) the broader walls of the resonator are fully recharged in the polarity opposite to
that shown in Fig. 31a. After that, the surface currents start to flow in the direction opposite to
that shown in Fig. 31b. This process, which repeats again and again, is conceptually similar to the
well-known oscillations in a lumped LC circuit, with the role of (now, distributed) capacitance
played mostly by the broadest faces of the resonator, and that of distributed inductance, mostly by
its narrower walls.


Fig. 7.31. Fields, charges, and currents in the basic $H_{101}$ mode of a rectangular metallic resonator, at two instants, (a) and (b), separated by $\Delta t = \pi/2\omega_{101}$ - schematically.


In order to generalize result (194) to higher oscillation modes, the second method discussed
above is more prudent. Separating variables as $f(\mathbf{r}) = X(x)Y(y)Z(z)$ in the Helmholtz equation (192), we
see that X, Y, and Z have to be sinusoidal functions of their arguments, with wave vector components
satisfying the characteristic equation

$$k_x^2 + k_y^2 + k_z^2 = k^2 \equiv \frac{\omega^2}{v^2}. \qquad (7.195)$$

66 In most electrical engineering handbooks, the index corresponding to the shortest side of the resonator is listed
last, so that the fundamental mode is denoted as $H_{110}$ and its eigenfrequency as $\omega_{110}$.


In contrast to the wave propagation problem, now we are dealing with standing waves along all three
dimensions, and have to satisfy the boundary conditions on all sets of parallel walls. It is straightforward
to check that the macroscopic boundary conditions ($E_\tau = 0$, $H_n = 0$) are fulfilled at the following field
component distribution:

$$\begin{aligned}
E_x &= E_1 \cos k_x x\,\sin k_y y\,\sin k_z z, & H_x &= H_1 \sin k_x x\,\cos k_y y\,\cos k_z z,\\
E_y &= E_2 \sin k_x x\,\cos k_y y\,\sin k_z z, & H_y &= H_2 \cos k_x x\,\sin k_y y\,\cos k_z z,\\
E_z &= E_3 \sin k_x x\,\sin k_y y\,\cos k_z z, & H_z &= H_3 \cos k_x x\,\cos k_y y\,\sin k_z z,
\end{aligned} \qquad (7.196)$$


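with each of the wave vector components having the equidistant spectrum similar to the one given by
Eq. (193):

$$k_x = \frac{\pi n}{a}, \quad k_y = \frac{\pi m}{b}, \quad k_z = \frac{\pi p}{l}, \qquad (7.197)$$

so that the full spectrum of eigenfrequencies is given by the following formula,

$$\omega_{nmp} = vk = v\left[\left(\frac{\pi n}{a}\right)^2 + \left(\frac{\pi m}{b}\right)^2 + \left(\frac{\pi p}{l}\right)^2\right]^{1/2}, \qquad (7.198)$$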


which is a natural generalization of Eq. (194). Note, however, that of 3 integers m, n, and p at least two 
have to be different from zero, in order to keep the fields (196) nonvanishing. 
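As a quick numerical illustration of Eqs. (194) and (198), here is a minimal Python sketch that lists the lowest eigenfrequencies of a rectangular resonator. The function names (`eigenfrequency`, `lowest_modes`), the vacuum filling ($v = c$), and the sample dimensions are my assumptions, chosen only for illustration; index triples with fewer than two nonzero entries are skipped, in accordance with the remark above.

```python
import math
from itertools import product

C = 299_792_458.0  # speed of light in vacuum, m/s (assumed filling: free space)

def eigenfrequency(n, m, p, a, b, l, v=C):
    """Angular eigenfrequency omega_nmp of Eq. (7.198), in rad/s."""
    return v * math.pi * math.sqrt((n / a) ** 2 + (m / b) ** 2 + (p / l) ** 2)

def lowest_modes(a, b, l, count=3, max_index=3, v=C):
    """Return the `count` lowest (omega, (n, m, p)) pairs.

    Index triples with fewer than two nonzero entries are skipped,
    because the fields (7.196) vanish identically for them.
    """
    modes = []
    for n, m, p in product(range(max_index + 1), repeat=3):
        if (n > 0) + (m > 0) + (p > 0) < 2:
            continue
        modes.append((eigenfrequency(n, m, p, a, b, l, v), (n, m, p)))
    return sorted(modes)[:count]

if __name__ == "__main__":
    a, b, l = 0.023, 0.010, 0.023  # illustrative dimensions, m (b is the shortest side)
    for omega, indices in lowest_modes(a, b, l):
        print(indices, f"f = {omega / (2 * math.pi) / 1e9:.2f} GHz")
```

For $b < a, l$, the first entry returned is the (1, 0, 1) mode, i.e. the fundamental frequency (194).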

Let us use Eq. (199) to evaluate the number of different modes in a relatively small region $d^3k \ll k^3$
(which is still much larger than the reciprocal volume, $1/V = 1/abl$, of the resonator) of the wave
vector space. Taking into account that each eigenfrequency (198), with $nmp \neq 0$, corresponds to two field
modes with different polarizations, 67 the argumentation absolutely similar to the one used at the end of
Sec. 7 for the 2D case yields

$$dN = 2V\frac{d^3k}{(2\pi)^3}. \qquad (7.199)$$


This property, valid for resonators of arbitrary shape, is broadly used in classical and quantum statistical
physics, 68 in the following form. If some electromagnetic mode property, $f(\mathbf{k})$, is a smooth function of
the wave vector, and volume V is large enough, then Eq. (199) may be used to approximate the sum over
the modes by an integral:

$$\sum_{\mathbf{k}} f(\mathbf{k}) \approx \int_N f(\mathbf{k})\,dN = \int_{\mathbf{k}} f(\mathbf{k})\frac{dN}{d^3k}\,d^3k = 2\frac{V}{(2\pi)^3}\int_{\mathbf{k}} f(\mathbf{k})\,d^3k. \qquad (7.200)$$

67 This fact becomes evident from plugging Eq. (196) into the Maxwell equation $\nabla\cdot\mathbf{E} = 0$. The resulting equation,
$k_xE_1 + k_yE_2 + k_zE_3 = 0$, with the discrete, equidistant spectrum (197) for each wave vector component, may be
satisfied by two linearly independent sets of constants $E_{1,2,3}$.

68 See, e.g., QM Sec. 1.1 and SM Sec. 2.6.
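The mode density (199) may be checked by a direct count. The sketch below, with the hypothetical function names `count_modes` and `estimate_modes`, enumerates the index triples of the spectrum (197) inside a sphere $k \le k_{\max}$ and compares the result with the integral of Eq. (199) over that sphere. One assumption goes beyond the text above: a triple with exactly one zero index is counted as a single mode (only one field polarization survives in that case), while a triple with all indices nonzero is counted twice.

```python
import math

def count_modes(a, b, l, k_max):
    """Count resonator modes with wave number k <= k_max by direct enumeration.

    Uses the spectrum (7.197). A triple with all three indices nonzero carries
    two polarizations; a triple with exactly one zero index carries one mode;
    triples with two or three zero indices carry none.
    """
    n_max = int(k_max * a / math.pi)
    m_max = int(k_max * b / math.pi)
    p_max = int(k_max * l / math.pi)
    total = 0
    for n in range(n_max + 1):
        for m in range(m_max + 1):
            for p in range(p_max + 1):
                k2 = (math.pi * n / a) ** 2 + (math.pi * m / b) ** 2 + (math.pi * p / l) ** 2
                if k2 > k_max ** 2:
                    continue
                nonzero = (n > 0) + (m > 0) + (p > 0)
                if nonzero == 3:
                    total += 2
                elif nonzero == 2:
                    total += 1
    return total

def estimate_modes(a, b, l, k_max):
    """Integrate dN = 2V d^3k/(2 pi)^3 of Eq. (7.199) over the sphere k <= k_max."""
    volume = a * b * l
    return 2 * volume * (4 * math.pi * k_max ** 3 / 3) / (2 * math.pi) ** 3

if __name__ == "__main__":
    a, b, l, k_max = 0.3, 0.2, 0.4, 200.0  # illustrative values: m, m, m, 1/m
    print(count_modes(a, b, l, k_max), round(estimate_modes(a, b, l, k_max)))
```

The two numbers should approach each other as $k_{\max}$ grows, since the estimate is valid only when the sphere contains many modes.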

Finally, note that low-loss resonators may be also formed by finite-length sections of not only
metallic waveguides with different cross-sections, but also of dielectric waveguides. Moreover, even
a simple slab of a dielectric material with a $\mu/\varepsilon$ ratio substantially different from that of its
environment (say, the free space) may be used as a high-Q Fabry-Perot interferometer (Fig. 32), due to
an effective wave reflection from its surfaces at normal and especially inclined incidence - see,
respectively, Eqs. (68) and Eqs. (91) and (95).



Fig. 7.32. Dielectric Fabry-Perot interferometer. 


Actually, such a dielectric Fabry-Perot interferometer is frequently more convenient for practical
purposes than a metallic resonator, due to its natural coupling to the environment, which enables a ready way
of wave insertion and extraction. The flip side of the same coin is that this coupling to the environment
provides an additional mechanism of power losses, limiting the resonance quality - see the next section.


7.10. Energy loss effects 

Inevitable energy losses (“power dissipation”) in passive media lead, in two different situations,
to two different effects. In a long transmission line fed by a constant wave source at one end, the losses
lead to a gradual attenuation of the wave, i.e. to the decrease of its amplitude, and hence of its power $\mathscr{P}$, with
the distance z along the line. In linear materials, the losses are proportional to the wave amplitude
squared, i.e. to the time average of the power itself, so that the energy balance on a small segment dz
takes the form

$$d\mathscr{P} = -\frac{d\mathscr{P}_{\rm loss}}{dz}\,dz \equiv -\alpha\mathscr{P}\,dz. \qquad (7.201)$$

Coefficient $\alpha$, participating in the last form of Eq. (201) and defined by the relation

$$\alpha \equiv \frac{d\mathscr{P}_{\rm loss}/dz}{\mathscr{P}}, \qquad (7.202)$$

is called the attenuation constant. 69 Comparing the evident solution of Eq. (201),


69 In engineering, attenuation is frequently measured in decibels per meter (acronymed as db/m or just dbm): 
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db/m 


= 10 log 
$$\mathscr{P}(z) = \mathscr{P}(0)e^{-\alpha z}, \qquad (7.203)$$

with Eq. (29), where k is replaced with $k_z$, we see that $\alpha$ may be expressed as

Wave attenuation

$$\alpha = 2\,\mathrm{Im}\,k_z, \qquad (7.204)$$

where $k_z$ is the component of the wave vector along the transmission line. In the most important limit
when the losses are low in the sense $\alpha \ll |k_z| \approx \mathrm{Re}\,k_z$, their effects on the field distributions along the
line’s cross-section are negligible, making the calculation of $\alpha$ rather straightforward. In particular, in
this limit the contributions to attenuation from two major sources, energy losses in the filling dielectric,
and the skin effect in conducting walls, are independent and additive.

The dielectric losses are especially simple to describe. Indeed, a review of our calculations in
Secs. 6-8 shows that all of them remain valid if either $\varepsilon(\omega)$, or $\mu(\omega)$, or both, and hence $k(\omega)$, have small
imaginary parts:

$$k'' = \omega\,\mathrm{Im}\left[\varepsilon^{1/2}(\omega)\mu^{1/2}(\omega)\right] \ll k'. \qquad (7.205)$$

In TEM transmission lines, $k = k_z$, and hence Eq. (205) yields

Energy loss in filling dielectric

$$\alpha_{\rm dielectric} = 2k'' = 2\omega\,\mathrm{Im}\left[\varepsilon^{1/2}(\omega)\mu^{1/2}(\omega)\right]. \qquad (7.206)$$

For dielectric waveguides, in particular optical fibers, these losses are the main attenuation mechanism.
As we already know from Sec. 8, in practical optical fibers $\kappa_t R \gg 1$, i.e. most of the field propagates (as
the evanescent wave) in the cladding, and the wave mode is very close to TEM. This is why it is
sufficient to use Eq. (206) for the cladding material alone.

In waveguides with non-TEM waves, we can readily use the relations between $k_z$ and k derived
above to re-calculate $k''$ into $\mathrm{Im}\,k_z$. (Note that at such re-calculation, values of $k_t$ stay real, because they
are just the eigenvalues of the Helmholtz equation (101), which does not include k.)

In waveguides and transmission lines with metallic conductors, much higher energy losses may
come from the skin effect. Let us calculate them, assuming that we know the field distribution in the
wave, in particular, the tangential component H of the magnetic field at the conductor surface. Then, if the
wavelength $\lambda$ is much larger than $\delta_s$, as it usually is, 70 we may use the results of the quasistatic
approximation derived in Sec. 6.2, in particular Eqs. (6.27)-(6.28) for the relation between the complex
amplitudes of the current density in the conductor and the tangential magnetic field

$$j_\omega(x) = \kappa_- H_\omega(x), \quad \kappa_- = -\frac{1-i}{\delta_s}, \quad \delta_s \equiv \left(\frac{2}{\mu\omega\sigma}\right)^{1/2}. \qquad (7.207)$$

The power loss density (per unit volume) may be now calculated by time averaging of Eq. (4.39):

$$p_{\rm loss}(x) = \frac{|j_\omega(x)|^2}{2\sigma} = \frac{|\kappa_-|^2|H_\omega(x)|^2}{2\sigma} = \frac{|H_\omega(x)|^2}{\delta_s^2\sigma}, \qquad (7.208)$$


70 As follows from Eq. (78), which may be used for estimates even in cases of arbitrary incidence, this condition 
is necessary for low attenuation: a«k only i f F « 1 . 
and its integration along the normal to the surface (through all the skin depth), using the exponential law
(6.26). This (elementary) integration yields the following power loss per unit area: 71

$$\frac{d\mathscr{P}_{\rm loss}}{dA} = \int_0^\infty p_{\rm loss}(x)\,dx = \frac{|H_\omega(0)|^2}{2\delta_s\sigma} = \frac{\mu\omega\delta_s}{4}|H_\omega(0)|^2. \qquad (7.209)$$

The total power loss $d\mathscr{P}_{\rm loss}/dz$ per unit length of a waveguide, i.e. the right-hand part of Eq. (201), now
may be calculated by the integration of the ratio $d\mathscr{P}_{\rm loss}/dA$ along the contour(s) limiting the cross-section of
all conductors of the line. Since our calculation is only valid for low losses, we may ignore their effect
on the field distribution, so that the unperturbed distribution may be used both in Eq. (209), i.e. the
numerator of Eq. (202), and also for the calculation of the average propagating power, i.e. the
denominator of Eq. (202), as the integral of the Poynting vector over the cross-section of the waveguide.

Energy loss in metallic walls

Let us see how this approach works for the TEM mode in one of the simplest TEM transmission
lines, the coaxial cable (Fig. 19). As we already know from Sec. 6, in the absence of losses, the
distribution of TEM mode fields is the same as in statics, namely:

$$H_z = 0, \quad H_\rho = 0, \quad H_\varphi(\rho) = H_0\frac{a}{\rho}, \qquad (7.210)$$

where $H_0$ is the field’s amplitude on the surface of the inner conductor, and

$$E_z = 0, \quad E_\rho(\rho) = ZH_\varphi(\rho) = ZH_0\frac{a}{\rho}, \quad E_\varphi = 0, \quad Z \equiv \left(\frac{\mu}{\varepsilon}\right)^{1/2}. \qquad (7.211)$$

Now we can, neglecting losses for now, use Eq. (42) to calculate the time-averaged Poynting vector

$$\bar{S} = \frac{Z|H_\varphi(\rho)|^2}{2} = \frac{Z|H_0|^2}{2}\left(\frac{a}{\rho}\right)^2, \qquad (7.212)$$

and from it, the total power propagating through the cross-section:

$$\mathscr{P} = \int_A \bar{S}\,d^2r = \frac{Z|H_0|^2a^2}{2}\,2\pi\int_a^b\frac{\rho\,d\rho}{\rho^2} = \pi Z|H_0|^2a^2\ln\frac{b}{a}. \qquad (7.213)$$


For the particular case of the coaxial cable (Fig. 19), the contours limiting the wall cross-sections
are circles of radii $\rho = a$ (where the surface field amplitude $H_\omega(0)$ equals, in our notation, $H_0$), and $\rho = b$
(where, according to Eq. (210), the field is a factor of b/a lower). As a result, for the power loss per unit
length, Eq. (209) yields

$$\frac{d\mathscr{P}_{\rm loss}}{dz} = \left[2\pi a + 2\pi b\left(\frac{a}{b}\right)^2\right]\frac{\mu\omega\delta_s}{4}|H_0|^2 = \frac{\pi}{2}a\left(1+\frac{a}{b}\right)\mu\omega\delta_s|H_0|^2. \qquad (7.214)$$

Note that at $a \ll b$, the losses in the inner conductor dominate, despite its smaller surface, because of
the higher surface field. Now we may plug Eqs. (213)-(214) into the definition (202) of $\alpha$, to calculate
the part of the attenuation constant associated with the skin effect:

$$\alpha_{\rm skin} \equiv \frac{d\mathscr{P}_{\rm loss}/dz}{\mathscr{P}} = \frac{1}{2\ln(b/a)}\left(\frac{1}{a}+\frac{1}{b}\right)\frac{\mu\omega\delta_s}{Z} = \frac{k\delta_s}{2\ln(b/a)}\left(\frac{1}{a}+\frac{1}{b}\right). \qquad (7.215)$$

We see that the relative (dimensionless) attenuation, $\alpha/k$, scales approximately as the ratio $\delta_s/\min[a, b]$.
This result should be compared with Eq. (78) for the normal incidence of plane waves on a conducting
surface.

71 For a normally-incident plane wave, this formula would bring us back to Eq. (78).


Let us evaluate $\alpha$ for the standard TV cable RG-6/U (with copper conductors of diameters 2a =
1 mm, 2b = 4.7 mm, and $\varepsilon \approx 2.2\varepsilon_0$, $\mu \approx \mu_0$). According to Eq. (6.27a), for f = 100 MHz ($\omega \approx 6.3\times10^8\ {\rm s}^{-1}$)
the skin depth of pure copper at room temperature (with $\sigma \approx 6.0\times10^7$ S/m) is close to $6.5\times10^{-6}$ m, while
$k = \omega(\varepsilon\mu)^{1/2} = (\varepsilon/\varepsilon_0)^{1/2}(\omega/c) \approx 3.1\ {\rm m}^{-1}$. As a result, the attenuation is rather low: $\alpha_{\rm skin} \approx 0.016\ {\rm m}^{-1}$, so that
the attenuation length scale $l \equiv 1/\alpha$ is about 60 m. Hence the attenuation in a cable connecting a roof
TV antenna to a TV set in the same house is not a big problem, though using a worse conductor, e.g.,
steel, would make the losses rather noticeable. (Hence the current worldwide shortage of copper.)
However, an attempt to use the same cable in the X-band (f ~ 10 GHz) is more problematic. Indeed,
though the skin depth $\delta_s \propto \omega^{-1/2}$ decreases with frequency, the wavelength drops, i.e. k increases, even
faster ($k \propto \omega$), so that the attenuation $\alpha_{\rm skin} \propto \omega^{1/2}$ becomes close to $0.16\ {\rm m}^{-1}$, and l to ~6 m. This is why
at such frequencies, it is more customary to use rectangular waveguides, with their larger internal
dimensions a, b ~ 1/k, and hence lower attenuation. Let me leave the calculation of this attenuation,
using Eq. (209) and the results derived in Sec. 9, for the reader’s exercise.
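The numbers quoted above are easy to reproduce. The following sketch evaluates the skin-effect attenuation of a coaxial cable from the expression $\alpha_{\rm skin} = k\delta_s(1/a + 1/b)/2\ln(b/a)$, which follows from plugging Eqs. (209) and (213) into the definition (202). The function names are hypothetical, and the code assumes non-magnetic conductors and filling ($\mu = \mu_0$); the decibel conversion uses the standard definition $10\log_{10}$ of the power ratio over 1 m.

```python
import math

MU_0 = 4e-7 * math.pi        # vacuum permeability, H/m
C = 299_792_458.0            # speed of light in vacuum, m/s

def skin_depth(omega, sigma, mu=MU_0):
    """Skin depth delta_s = (2 / (mu * omega * sigma))**0.5, in meters."""
    return math.sqrt(2.0 / (mu * omega * sigma))

def coax_skin_attenuation(f, a, b, eps_r, sigma):
    """Skin-effect attenuation constant of a coaxial cable, in 1/m.

    f: frequency (Hz); a, b: inner and outer conductor radii (m);
    eps_r: relative permittivity of the filling; sigma: conductivity (S/m).
    """
    omega = 2 * math.pi * f
    k = math.sqrt(eps_r) * omega / C
    delta_s = skin_depth(omega, sigma)
    return k * delta_s * (1 / a + 1 / b) / (2 * math.log(b / a))

def to_db_per_meter(alpha):
    """Convert a power attenuation constant (1/m) into decibels per meter."""
    return 10 * alpha / math.log(10)

if __name__ == "__main__":
    # RG-6/U parameters quoted in the text
    for f in (100e6, 10e9):
        alpha = coax_skin_attenuation(f, a=0.5e-3, b=2.35e-3, eps_r=2.2, sigma=6.0e7)
        print(f"f = {f:.0e} Hz: alpha = {alpha:.3f} 1/m, 1/alpha = {1 / alpha:.0f} m")
```

Since $\alpha_{\rm skin} \propto \omega^{1/2}$, raising the frequency by a factor of 100 raises the attenuation by a factor of 10, in agreement with the estimates above.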

The power loss effect on free oscillations in resonators is different: there it leads to a gradual
decay of the oscillation energy $\mathscr{E}$ in time. The useful measure of this decay, called the Q factor, may be
introduced by writing the temporal analog of Eq. (201):

$$d\mathscr{E} = -\mathscr{P}_{\rm loss}\,dt \equiv -\frac{\omega}{Q}\mathscr{E}\,dt, \qquad (7.216)$$

where $\omega$ is the eigenfrequency in the loss-free limit, and the dimensionless Q factor is defined by a
relation parallel to Eq. (202): 72

$$\frac{\omega}{Q} \equiv \frac{\mathscr{P}_{\rm loss}}{\mathscr{E}}. \qquad (7.217)$$

The solution to Eq. (216),

$$\mathscr{E}(t) = \mathscr{E}(0)e^{-t/\tau}, \quad \text{with } \tau \equiv \frac{Q}{\omega} = \frac{Q/2\pi}{\omega/2\pi} = \frac{Q\mathcal{T}}{2\pi}, \qquad (7.218)$$

which is an evident temporal analog of Eq. (203), shows the physical meaning of the Q factor: the
characteristic time $\tau$ of the oscillation energy decay is $(Q/2\pi)$ times longer than the oscillation period $\mathcal{T}
= 2\pi/\omega$. (Another interpretation of Q comes from the relation 73

$$Q = \frac{\omega}{\Delta\omega}, \qquad (7.219)$$

where $\Delta\omega$ is the so-called FWHM 74 bandwidth of the resonance, namely the difference between the two
values of the external signal frequency, one above and one below $\omega$, at which the energy of forced
oscillations induced in the resonator by an input signal is half of its resonant value.)

72 As losses grow, the oscillation waveform deviates from the sinusoidal one, and the very notion of “oscillation
frequency” becomes vague. As a result, parameter Q is well defined only if it is much higher than 1.

73 See, e.g., CM Sec. 4.1.


In the important particular case of resonators formed by insertion of metallic walls into a TEM
transmission line of small cross-section (with the linear size scale a much less than the wavelength $\lambda$),
there is no need to calculate the Q factor directly if the line attenuation coefficient $\alpha$ is already known.
In fact, as was discussed in Sec. 9 above, the standing waves in such a resonator, of the length given by
Eq. (183): $l = p(\lambda/2)$ with p = 1, 2,..., may be understood as an overlap of two TEM waves running in
opposite directions, or in other words, a traveling wave and its reflection from one of the ends, the
whole roundtrip taking time $\Delta t = 2l/v = p\lambda/v = 2\pi p/\omega = p\mathcal{T}$. According to Eq. (201), at this distance the
wave’s power should drop by the factor $\exp\{-2\alpha l\} = \exp\{-p\alpha\lambda\}$. On the other hand, the same decay may be
viewed as happening in time, and according to Eq. (216), result in the drop by the factor $\exp\{-\Delta t/\tau\} =
\exp\{-(p\mathcal{T})/(Q/\omega)\} = \exp\{-2\pi p/Q\}$. Comparing these two exponents, we get

Q vs. $\alpha$

$$Q = \frac{2\pi}{\alpha\lambda} = \frac{k}{\alpha}. \qquad (7.220)$$


This simple relation neglects the losses at wave reflection from the walls limiting the resonator
length. Such approximation is indeed legitimate at $a \ll \lambda$; if this relation is violated, or if we are dealing
with more complex resonator modes (such as those based on the reflection of E or H waves), the Q
factor may be smaller than that given by Eq. (220), and needs to be calculated directly. A substantial
relief for such a direct calculation is that, just as in the calculation of small attenuation in waveguides, in
the low-loss limit ($Q \gg 1$), both the numerator and denominator of the right-hand part of Eq. (217) may
be calculated neglecting the effects of the power loss on the field distribution in the resonator. I am
leaving such a calculation, for the simplest (rectangular and circular) resonators, for the reader’s exercise.
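As a simple arithmetic illustration of Eqs. (218) and (220), the sketch below estimates the Q factor and the energy decay time of a resonator cut from the coaxial cable discussed above, using the rounded values $k \approx 3.1\ {\rm m}^{-1}$ and $\alpha \approx 0.016\ {\rm m}^{-1}$ quoted there for 100 MHz. The function names are hypothetical, and the estimate inherits the assumptions of Eq. (220), in particular the neglect of the reflection losses.

```python
import math

def q_from_attenuation(k, alpha):
    """Q factor of a TEM-line resonator from Eq. (7.220): Q = k / alpha."""
    return k / alpha

def energy_decay_time(q, omega):
    """Decay time tau = Q / omega of the oscillation energy, Eq. (7.218)."""
    return q / omega

if __name__ == "__main__":
    k, alpha = 3.1, 0.016          # 1/m, rounded values quoted in the text for 100 MHz
    omega = 2 * math.pi * 100e6    # rad/s
    q = q_from_attenuation(k, alpha)
    tau = energy_decay_time(q, omega)
    print(f"Q = {q:.0f}, tau = {tau * 1e9:.0f} ns")
```

With these inputs, Q is somewhat below 200, so that the oscillation energy decays over a few tens of oscillation periods.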


To conclude this chapter, the last remark: in some resonators (including certain dielectric 
resonators and metallic resonators with holes in their walls), additional losses due to wave radiation into 
the environment are also possible. In some simple cases (say, the Fabry-Perot interferometer shown in 
Fig. 32) the calculation of these radiative losses is straightforward, but sometimes it requires more
elaborate approaches, which will be discussed in the next chapter.


7.11. Exercise problems 

7.1.* Find the temporal Green’s function of a medium whose complex dielectric constant obeys
Eq. (32), using:

(i) the Fourier transform, and

(ii) the direct solution of Eq. (30), which describes the corresponding model of the medium.

Hint: For the Fourier transform, you may like to use the Cauchy integral. 75

7.2. The electric polarization of a material responds in the following way to an electric field
step: 76

$$P(t) = \varepsilon_1E_0\left(1 - e^{-t/\tau}\right), \quad \text{if } E(t) = E_0\times\begin{cases}0, & \text{for } t < 0,\\ 1, & \text{for } 0 < t,\end{cases}$$

where $\tau$ is a positive constant.

74 This is the acronym for “Full Width at Half-Maximum”.

75 See, e.g., MA Eq. (15.2).

7.3. Calculate the complex dielectric constant $\varepsilon(\omega)$ for a material whose dielectric-response
Green’s function, defined by Eq. (23), is

$$G(\theta) = G_0\left(1 - e^{-\theta/\tau}\right),$$

with some positive constants $G_0$ and $\tau$. What is the difference between this dielectric response and the
apparently similar one considered in the previous problem?

7.4. Use the Lorentz oscillator model of an atom, given by Eq. (30), to calculate the average
potential energy of the atom in a uniform, sinusoidal ac electric field, and use the result to calculate the
potential profile created for the atom by a standing electromagnetic wave with the electric field
amplitude $E_\omega(\mathbf{r})$. Discuss the conditions of validity of your result.

7.5 . The solution of the previous problem shows that a standing plane wave exerts a time- 
averaged force on a non-relativistic charged particle. Reveal the physics of this force by writing and 
solving the equations of motion of a free particle in: 

(i) a linearly-polarized, monochromatic, plane traveling wave, and 

(ii) a similar but standing wave. 

Discuss the conditions of validity of your result. 

7.6 . Calculate, sketch and discuss the dispersion relation for electromagnetic waves propagating 
in a Lorentz oscillator medium described by Eq. (32), for the case of negligible damping. 

7.7. As was briefly discussed in Sec. 2, 77 a wave pulse of a finite but relatively large spatial
extension $\Delta r \gg \lambda \equiv 2\pi/k$ may be represented with a wave packet - a sum of sinusoidal waves with wave
vectors k within a relatively narrow interval. Consider an electromagnetic plane wave packet of this
type, with the electric field distribution

$$\mathbf{E}(\mathbf{r},t) = \mathrm{Re}\int_{-\infty}^{+\infty}\mathbf{E}_k e^{i(kz-\omega_kt)}\,dk, \quad \text{with } \omega_k\left[\varepsilon(\omega_k)\mu(\omega_k)\right]^{1/2} \equiv |k|,$$

propagating along axis z in an isotropic, linear, and loss-free (but not necessarily dispersion-free)
medium. Express the full energy of the packet (per unit area of wave’s front) via complex amplitudes $\mathbf{E}_k$,
and discuss its dependence on time.

7.8. Analyze the effect of a constant, uniform magnetic field $\mathbf{B}_0$, parallel to the direction n of
electromagnetic wave propagation, on the wave dispersion in plasma, within the same simple model that
was used in the lecture notes for derivation of Eq. (7.38). (Limit your analysis to relatively weak waves,
whose magnetic field is negligible in comparison with $\mathbf{B}_0$.)

Hint: You may like to represent the incident wave as a linear superposition of two circularly
polarized waves, with the left- and right-hand polarization.

76 This function E(t) is of course proportional to the well-known step function $\theta$ - see, e.g., MA Eq. (14.3). I am
not using this notion just to avoid a possible confusion between two different uses of the Greek letter $\theta$.

77 And in more detail in CM Sec. 5.3, and especially in QM Sec. 2.1.

7.9. A monochromatic, plane electromagnetic wave is normally incident from free space on a
uniform slab of a material with electric permittivity $\varepsilon$ and magnetic permeability $\mu$, with the slab
thickness d comparable with the wavelength.

(i) Calculate the power transmission coefficient $\mathscr{T}$, i.e. the fraction of the incident power that is
transmitted through the slab.

(ii) Assuming that $\varepsilon$ and $\mu$ are frequency-independent and positive, analyze in detail the
frequency dependence of $\mathscr{T}$. In particular, how does function $\mathscr{T}(\omega)$ depend on the film thickness d and
the wave impedance $Z \equiv (\mu/\varepsilon)^{1/2}$ of its material?

7.10. A monochromatic, plane electromagnetic wave with free-space wave number $k_0$ is
normally incident on a plane conducting film of thickness $d \sim \delta_s \ll 1/k_0$. Calculate the power
transmission coefficient of the system, i.e. the fraction of incident wave’s power propagating beyond the
film. Analyze the result in the limits of small and large ratios $d/\delta_s$.


7.11. A plane wave of frequency $\omega$ is normally incident, from
free space, on a plane surface of a material with real values of the
electric permittivity $\varepsilon'$ and magnetic permeability $\mu'$. To minimize
wave reflection from the surface, you may cover it with a layer, of
thickness d, of another transparent material - see Fig. on the right.
Calculate the optimal values of $\varepsilon$, $\mu$, and d.

7.12. A monochromatic, plane wave is incident from inside a
medium with $\varepsilon\mu > \varepsilon_0\mu_0$ on its plane surface, at the angle of incidence $\theta$ larger than the critical angle
$\theta_c = \arcsin(\varepsilon_0\mu_0/\varepsilon\mu)^{1/2}$. Calculate the depth $\delta$ of the evanescent wave penetration into the free space and
analyze its dependence on $\theta$. Does the result depend on the wave polarization?


7.13 . Analyze the possibility of propagation of surface electromagnetic waves along a plane 
boundary between plasma and free space. In particular, calculate and analyze the dispersion relation of 
the waves. 

Hint : Assume that the magnetic field of the wave is parallel to the boundary and perpendicular to 
the wave propagation direction. (After solving the problem, justify this mode choice.) 


7.14. Calculate the characteristic impedance $Z_W$ of the long, straight TEM transmission lines
formed by metallic electrodes with cross-sections shown in Fig. below:

(i) two round, parallel wires, separated by distance $d \gg R$,

(ii) microstrip line of width $w \gg d$,

(iii) stripline with $w \gg d_1 \sim d_2$,

in all cases using the macroscopic boundary conditions on metallic surfaces. Assume that the conductors
are embedded into a linear dielectric with constant $\varepsilon$ and $\mu$.


7.15 . Modify results of Problem 10(ii) for a superconductor microstrip line, taking into account 
the magnetic field penetration into both the strip and the ground plane. 

7.16.* What lumped ac circuit would be equivalent to the system shown in Fig. 20, with incident
wave’s power $\mathscr{P}_i$? Assume that the wave reflected from the load circuit does not return to it.


7.17. Find the lumped ac circuit equivalent to a loss-free
TEM transmission line of length $l \sim \lambda$, with a small cross-section
area $A \ll \lambda^2$, as “seen” (measured) from one end, if the line’s
conductors are galvanically connected (“shortened”) at the other end
- see Fig. on the right. Discuss the result’s dependence on the signal
frequency.

7.18. Represent the fundamental $H_{10}$ wave in a rectangular waveguide (Fig. 22) with a sum of
two plane waves, and discuss the physics behind such a representation.

7.19.* For a metallic coaxial cable with the circular cross-section (Fig. 21), find the lowest non-TEM
mode and calculate its cutoff frequency.

7.20. Two coaxial cable sections are connected coaxially - see
Fig. on the right, which shows system’s cut along its symmetry axis.
Relations (118) and (120) seem to imply that if the ratios b/a of these
sections are equal, their impedance matching is perfect, i.e. a TEM
wave incident from one side on the connection would pass it without
any reflection at all: $\mathscr{R} = 0$. Is this statement correct?

7.21.* Use the recipe outlined in Sec. 8 to prove the characteristic equation (161) for the HE and
EH modes in a round, step-index optical fiber.

7.22. Find the lowest eigenfrequencies, and corresponding oscillation
modes, of a round cylindrical resonator (see Fig. on the right) with perfectly
conducting walls.



7.23. A plane, monochromatic wave propagates through a medium whose Ohmic conductance $\sigma$
dominates the power losses, while the electric and magnetic polarization effects are negligible. Calculate
the wave attenuation coefficient and relate the result with some calculation carried out in Chapter 6.


7.24. Generalize the telegrapher’s equations (110)-(111) by taking into account small energy
losses in:

(i) transmission line’s conductors, and 

(ii) the media separating the conductors, 

using their simplest (Ohmic) models. Formulate the conditions of validity of the resulting equations. 


7.25. Calculate the skin-effect contribution to the attenuation coefficient $\alpha$, defined by Eq.
(202), for the basic ($H_{10}$) mode propagating in a waveguide with the rectangular cross-section - see Fig.
22. Use the results to evaluate $\alpha$ and L for a 10 GHz wave in the standard X-band waveguide WR-90
(with copper walls, a = 23 mm, b = 10 mm, and no dielectric filling), at room temperature. Compare the
estimate with that, made in Sec. 10, for a standard coaxial cable, for the same frequency.

7.26.* Calculate the skin-effect contribution to the attenuation coefficient $\alpha$ of

(i) the basic ($H_{11}$) mode, and

(ii) the $H_{01}$ mode

in a metallic waveguide with the circular cross-section (Fig. 23a), and analyze the low-frequency ($\omega
\to \omega_c$) and high-frequency ($\omega \gg \omega_c$) behaviors of $\alpha$ for each of these modes.

7.27. For a rectangular metallic-wall resonator with dimensions $a\times b\times l$ ($b < a, l$), calculate the Q-factor
in the fundamental (lowest) oscillation mode, due to the skin-effect losses in the walls. Evaluate
the factor (and the lowest eigenfrequency) for a 23x23x10 mm resonator with copper walls, at room
temperature.

7.28. Calculate the lowest eigenfrequency and Q factor (due to the
skin-effect losses) of the toroidal (axially-symmetric) resonator with
metallic walls and interior’s cross-section shown in Fig. on the right,
within the limit $d \ll r, R$.



7.29. Express the contribution to the damping coefficient (the reciprocal Q-factor) of a resonator,
due to small energy losses in the dielectric that fills it, via dielectric’s complex functions $\varepsilon(\omega)$ and $\mu(\omega)$
of the material.

7.30. For the dielectric Fabry-Perot resonator (Fig. 32) with the normal wave incidence, find the
Q-factor due to radiation losses in the limit of strong impedance mismatch ($Z \gg Z_0$), using two
methods:

(i) from the energy balance, using Eq. (217), and

(ii) from the frequency dependence of the power transmission coefficient, using Eq. (219).
Compare the results.







Essential Graduate Physics 


EM: Classical Electrodynamics 


Chapter 8. Radiation, Scattering, Interference, and Diffraction 

This chapter continues the discussion of the electromagnetic wave propagation, now focusing on the 
results of wave incidence on a passive object. Depending on the object’s shape, the result of this 
interaction is called either scattering, or diffraction, or interference. However, as we will see below, the 
boundary between these effects is blurry, and their mathematical description may be conveniently based 
on a single key calculation - the electric dipole radiation of a spherical wave by a small source. 
Naturally, I will start the chapter from this calculation, deriving it from an even more general result - 
the “retarded potentials ” solution of the Maxwell equations. 


8.1. Retarded potentials 

Let us start from the general solution of the Maxwell equations in a dispersion-free, linear,
uniform, isotropic medium, characterized by frequency-independent, real $\varepsilon$ and $\mu$ - for example, free
space. 1 The easiest way to perform this calculation is to use the scalar ($\phi$) and vector ($\mathbf{A}$) potentials of
electromagnetic field, that are defined via the electric and magnetic fields by Eqs. (6.106):

$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \quad \mathbf{B} = \nabla\times\mathbf{A}. \qquad (8.1)$$


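As was discussed in Chapter 6, imposing upon the potentials the Lorenz gauge condition (6.108),

$$\nabla\cdot\mathbf{A} + \frac{1}{v^2}\frac{\partial\phi}{\partial t} = 0, \quad \text{with } v^2 \equiv \frac{1}{\varepsilon\mu}, \qquad (8.2)$$

(which does not affect fields E and B) the macroscopic Maxwell equations for the fields may be recast
into a pair of very similar, simple equations (6.109) for the potentials:

$$\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\varepsilon}, \qquad (8.3a)$$

$$\nabla^2\mathbf{A} - \frac{1}{v^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu\mathbf{j}. \qquad (8.3b)$$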


Let us calculate the fields induced by the stand-alone electric charge and current densities $\rho(\mathbf{r}, t)$
and $\mathbf{j}(\mathbf{r}, t)$, thinking of them as known functions. 2 The idea of how this may be done may be borrowed from
electro- and magnetostatics. Indeed, for the stationary case ($\partial/\partial t = 0$), the solutions of Eqs. (8.3) are
given by the evident generalization of, respectively, Eq. (1.38) and Eq. (5.28) to the uniform, linear
medium:

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon}\int\rho(\mathbf{r}')\frac{d^3r'}{|\mathbf{r}-\mathbf{r}'|}, \qquad (8.4a)$$

$$\mathbf{A}(\mathbf{r}) = \frac{\mu}{4\pi}\int\mathbf{j}(\mathbf{r}')\frac{d^3r'}{|\mathbf{r}-\mathbf{r}'|}. \qquad (8.4b)$$

As we know, these expressions may be derived by, first, calculating the potential of a point source, and
then using the linear superposition principle for a system of such sources.

1 When necessary (e.g., at the discussion of the Cherenkov radiation in Sec. 10.4), it will be not too hard to
generalize these results to dispersive media.

2 Such thinking would not prevent the results from being valid for the case when $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$ should be
calculated self-consistently.

Let us do the same for the time-dependent case, starting from the field induced by a time-dependent
point charge at origin: 3

$$\rho(\mathbf{r},t) = q(t)\delta(\mathbf{r}). \qquad (8.5)$$

In this case Eq. (3a) is homogeneous everywhere but the origin:

$$\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = 0, \quad \text{at } r \neq 0. \qquad (8.6)$$

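Due to the spherical symmetry of the problem, it is natural to look for a spherically-symmetric solution
to this equation. 4 Thus, we may simplify the Laplace operator 5 correspondingly, and reduce Eq. (6) to

$$\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right]\phi = 0, \quad \text{at } r \neq 0. \qquad (8.7)$$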

If we now introduce a new variable $\chi \equiv r\phi$, Eq. (7) is reduced to the 1D wave equation

$$\left(\frac{\partial^2}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)\chi = 0, \quad \text{at } r \neq 0. \qquad (8.8)$$

From the discussion in Chapter 7, 6 we know that its general solution may be presented as

$$\chi(r,t) = \chi_{\rm out}\left(t - \frac{r}{v}\right) + \chi_{\rm in}\left(t + \frac{r}{v}\right), \qquad (8.9)$$

where $\chi_{\rm in}$ and $\chi_{\rm out}$ are (so far) arbitrary functions of one variable. The physical sense of $\phi_{\rm out} = \chi_{\rm out}/r$ is a
spherical wave propagating from our source (at r = 0) to outer space, i.e. exactly the solution we are
looking for. On the other hand, $\phi_{\rm in} = \chi_{\rm in}/r$ describes a spherical wave that could be created by some
distant spherically-symmetric source, that converges on our charge located at the origin - evidently not
the effect we want to consider here. Discarding this term, and returning to $\phi = \chi/r$, we can write the
solution (7) as

$$\phi(r,t) = \frac{1}{r}\chi_{\rm out}\left(t - \frac{r}{v}\right). \qquad (8.10)$$


3 Admittedly, this expression does not satisfy the continuity equation (4.5), but we will correct this deficiency 
imminently, at the linear superposition stage - see Eq. (17) below. 

4 Let me emphasize that this is not the general solution to Eq. (6). For example, it does not describe the fields
created by other sources, that pass by the considered charge q(t). However, such fields are irrelevant for our
current task: to calculate the field created by the charge q(t) itself.

5 See, e.g., MA Eq. (10.9). 

6 See also CM Sec. 5.3.

In order to find function $\chi_{out}$, let us consider distances r so small that the time derivative in Eq.
(3a), with the right-hand part (5),

$$\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{q(t)}{\varepsilon}\,\delta(\mathbf{r}), \tag{8.11}$$

is much smaller than the spatial derivative (that diverges at $r \to 0$). Then Eq. (11) is reduced to the
electrostatic equation whose solution (4a), for source (5), is

$$\phi(r \to 0, t) = \frac{q(t)}{4\pi\varepsilon r}. \tag{8.12}$$

Now requiring the two solutions, (10) and (12), to coincide at $r \ll vt$, we get $\chi_{out}(t) = q(t)/4\pi\varepsilon$, so that
Eq. (10) becomes

$$\phi(r,t) = \frac{1}{4\pi\varepsilon r}\,q\left(t - \frac{r}{v}\right). \tag{8.13}$$
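The result (8.13) can be checked numerically. The following is only a minimal sketch, assuming free space ($\varepsilon = \varepsilon_0$, v = c) and a hypothetical Gaussian charge pulse q(t) that is not part of the text; the function names and the finite-difference steps are also illustrative choices. It evaluates the left-hand side of Eq. (8.7) for $\phi = q(t - r/v)/4\pi\varepsilon r$ by central finite differences, using the identity $(1/r^2)\,\partial_r(r^2\partial_r\phi) = (1/r)\,\partial_r^2(r\phi)$, and compares it with the size of the spatial term alone.

```python
import math

EPS = 8.8541878128e-12   # F/m; free space is assumed, so eps = eps_0
V = 299_792_458.0        # m/s; wave speed, v = c in free space


def q(t):
    """Hypothetical smooth charge pulse q(t), in coulombs (illustration only)."""
    return 1e-9 * math.exp(-(t / 1e-9) ** 2)


def phi(r, t):
    """Eq. (8.13): phi(r, t) = q(t - r/v) / (4 pi eps r)."""
    return q(t - r / V) / (4 * math.pi * EPS * r)


def relative_residual(r, t, dr=1e-3, dt=1e-12):
    """|left-hand side of Eq. (8.7)| divided by the size of its spatial term.

    The common factor 1/r cancels in the ratio, so the function works with
    chi = r * phi and Eq. (8.8). The steps dr and dt are independent.
    """
    def chi(rr, tt):
        return rr * phi(rr, tt)

    spatial = (chi(r + dr, t) - 2 * chi(r, t) + chi(r - dr, t)) / dr ** 2
    temporal = (chi(r, t + dt) - 2 * chi(r, t) + chi(r, t - dt)) / (V * dt) ** 2
    return abs(spatial - temporal) / abs(spatial)


if __name__ == "__main__":
    r = 1.0                      # observation distance, m
    t = r / V + 0.5e-9           # a moment when the pulse is passing by
    print("relative residual of the wave equation:", relative_residual(r, t))
```

The residual should be small compared with unity, limited only by the finite-difference truncation; it does not vanish exactly because the steps dr and dt were deliberately chosen not to satisfy dr = v dt.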


Just as was done in statics, this result may be readily generalized for the arbitrary position
$\mathbf{r}'$ of the point charge:

$$\rho(\mathbf{r},t) = q(t)\,\delta(\mathbf{r} - \mathbf{r}') \equiv q(t)\,\delta(\mathbf{R}), \tag{8.14}$$

where R is the distance between the field observation point $\mathbf{r}$ and the source position point $\mathbf{r}'$, i.e. the
length of the vector,

$$\mathbf{R} \equiv \mathbf{r} - \mathbf{r}', \tag{8.15}$$

connecting these points - see Fig. 1.



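Obviously, Eq. (13) becomes

$$\phi(\mathbf{r},t) = \frac{1}{4\pi\varepsilon R}\,q\left(t - \frac{R}{v}\right). \tag{8.16}$$

Now we can use the linear superposition principle to write, for the arbitrary charge distribution $\rho(\mathbf{r}, t)$,

Retarded scalar potential

$$\phi(\mathbf{r},t) = \frac{1}{4\pi\varepsilon}\int \rho\left(\mathbf{r}', t - \frac{R}{v}\right)\frac{d^3r'}{R}, \tag{8.17a}$$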


where integration is extended over all charges of the system under analysis. Acting absolutely similarly,
for the vector potential we get

Retarded vector potential

$$\mathbf{A}(\mathbf{r},t) = \frac{\mu}{4\pi}\int \mathbf{j}\left(\mathbf{r}', t - \frac{R}{v}\right)\frac{d^3r'}{R}. \tag{8.17b}$$

(Now nothing prevents functions $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$ from satisfying the continuity relation.)

Solutions (17) are called the retarded potentials, the name signifying that the observed fields are
“retarded” (delayed) in time by $\Delta t = R/v$ relative to the source variations, due to the finite speed v of the
electromagnetic wave propagation. These solutions are so important that they deserve at least a couple
of general remarks.

First, remarkably, these simple expressions are exact solutions of the Maxwell equations (93) in
a uniform medium for an arbitrary distribution of stand-alone charges and currents. They also may be
considered as the general solutions of these equations, provided that the integration is extended over all
field sources in the Universe - or at least in the part of it that affects our observations.

Second, if functions $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$ include the microscopic (bound) charges and currents as
well, the macroscopic Maxwell equations (6.93) are valid with the replacement $\varepsilon \to \varepsilon_0$ and $\mu \to \mu_0$, so
that the retarded potential solutions (17) are also valid - with the same replacement.

Finally, Eqs. (17) may be plugged into Eqs. (1), giving (after an explicit differentiation) the so-called
Jefimenko equations for fields E and B - similar in structure to Eqs. (17), but more cumbersome.
Conceptually, the existence of such equations is good news, because they are free from the gauge
ambiguity pertinent to potentials $\phi$ and A. However, the practical value of these explicit expressions for
the fields is not too high: for all applications I am aware of, it is easier to use Eqs. (17) to calculate the
particular expressions for the potentials first, and only then calculate the fields from Eqs. (1). Let me
present the (apparently most important) example of this approach.
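To make the meaning of the retardation more tangible, here is a minimal sketch of the superposition behind Eq. (8.17a) for a discrete set of point sources, each contributing the term (8.16). Free space ($\varepsilon = \varepsilon_0$, v = c) is assumed, and the function names and the abruptly switched-on charge are hypothetical illustrations rather than anything defined in the text. (Just like the source (8.5), a single charge with changing q(t) ignores the continuity equation; it is used here only to show the delay $\Delta t = R/v$.)

```python
import math

EPS = 8.8541878128e-12   # F/m; free space assumed
V = 299_792_458.0        # m/s; v = c in free space


def retarded_phi(r, t, sources, eps=EPS, v=V):
    """Sum of the point-source potentials (8.16) over all sources.

    `sources` is a list of (position, q_of_t) pairs: position is an (x, y, z)
    tuple in meters, q_of_t is a function of time returning coulombs.
    """
    total = 0.0
    for position, q_of_t in sources:
        R = math.dist(r, position)               # distance R = |r - r'|
        total += q_of_t(t - R / v) / (4 * math.pi * eps * R)
    return total


def switched_on(t):
    """Hypothetical charge that appears at t = 0 (illustration only)."""
    return 1e-9 if t >= 0.0 else 0.0


if __name__ == "__main__":
    sources = [((0.0, 0.0, 0.0), switched_on)]
    point = (3.0, 0.0, 0.0)
    delay = 3.0 / V                              # R / v for this observation point
    print("before the delay:", retarded_phi(point, 0.5 * delay, sources))
    print("after the delay: ", retarded_phi(point, 2.0 * delay, sources))
```

Before the moment t = R/v the observer sees no potential at all; after it, the value coincides with the static Coulomb potential of the charge.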


8.2. Electric dipole radiation 

Consider again the problem that was discussed in electrostatics (Sec. 3.1), namely the field of a
localized source with linear dimensions $a \ll r$ (Fig. 1), but now with time-dependent charge and/or
current distribution. Using the arguments of that discussion, in particular the condition expressed by Eq.
(3.1), $r' \ll r$, we may apply the Taylor expansion (3.3),

$$f(\mathbf{R}) = f(\mathbf{r}) - \mathbf{r}'\cdot\nabla f(\mathbf{r}) + \dots, \tag{8.18}$$

to function $f(\mathbf{R}) \equiv R$ (for which $\nabla f(\mathbf{r}) = \nabla r = \mathbf{n}$, where $\mathbf{n} \equiv \mathbf{r}/r$ is the unit vector directed toward the
observation point, see Fig. 1) to approximate distance R as

$$R \approx r - \mathbf{r}'\cdot\mathbf{n}. \tag{8.19}$$
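The quality of this far-distance approximation is easy to probe numerically. The sketch below (the function names and the sample coordinates are illustrative assumptions) compares the exact distance $R = |\mathbf{r} - \mathbf{r}'|$ with $r - \mathbf{r}'\cdot\mathbf{n}$; the neglected terms are of the second order in r', i.e. of the order of $r'^2/r$.

```python
import math


def exact_distance(r, r_src):
    """Exact R = |r - r'| for 3D points given as tuples."""
    return math.dist(r, r_src)


def approx_distance(r, r_src):
    """First two terms of the expansion: R ~ r - r' . n, with n = r / |r|."""
    r_len = math.hypot(*r)
    n = [x / r_len for x in r]
    return r_len - sum(a * b for a, b in zip(r_src, n))


if __name__ == "__main__":
    r = (30.0, 40.0, 0.0)         # observation point, |r| = 50
    r_src = (0.3, -0.2, 0.1)      # source point, |r'| << |r|
    print("exact R: ", exact_distance(r, r_src))
    print("approx R:", approx_distance(r, r_src))
```

For these sample points the two values differ by about $10^{-3}$, in line with the estimate $r'^2/r$.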

In each of the retarded potential formulas (17), R participates in two places: in the denominator
and in the source time argument. If $\rho$ and $\mathbf{j}$ change in time on the scale $\sim 1/\omega$, where $\omega$ is some characteristic
frequency, then any change of the argument $(t - R/v)$ on that time scale, for example due to a change of R on
the spatial scale $\sim v/\omega = 1/k$, may substantially change these functions. Thus, expansion (18) may be
applied to R in the argument $(t - R/v)$ only if $ka \ll 1$, i.e. if the system size a is much smaller than the
radiation wavelength $\lambda = 2\pi/k$. On the other hand, function 1/R changes relatively slowly, and for it even
the first term of expansion (19) gives a good approximation as soon as $a \ll r, R$. In this approach, Eq.
(17a) yields

$$\phi(\mathbf{r},t) \approx \frac{1}{4\pi\varepsilon r}\int \rho\left(\mathbf{r}', t - \frac{R}{v}\right)d^3r' \equiv \frac{1}{4\pi\varepsilon r}\,Q\left(t - \frac{R}{v}\right), \tag{8.20}$$

where Q(t) is the net electric charge of the localized system. Due to the charge conservation, this charge
cannot change with time, so that the approximation (20) gives just a static Coulomb field of
our localized source, rather than a radiated wave.


Let us, however, apply a similar approximation to the vector potential (17b):

$$\mathbf{A}(\mathbf{r},t) \approx \frac{\mu}{4\pi r}\int \mathbf{j}\left(\mathbf{r}', t - \frac{R}{v}\right)d^3r'. \tag{8.21}$$

According to Eq. (5.87), in statics the right-hand part of this expression would vanish, but in dynamics
this is no longer true. For example, if the current is due to a nonrelativistic motion 7 of a system of
charges $q_k$, we can write

$$\int \mathbf{j}(\mathbf{r}',t)\,d^3r' = \sum_k q_k\dot{\mathbf{r}}_k(t) = \frac{d}{dt}\sum_k q_k\mathbf{r}_k(t) \equiv \dot{\mathbf{p}}(t), \tag{8.22}$$

where $\mathbf{p}(t)$ is the dipole moment of the localized system, defined by Eq. (3.6). Now, after the integration,
we may keep only the first term of approximation (19) in the argument $(t - R/v)$ as well, getting

$$\mathbf{A}(\mathbf{r},t) \approx \frac{\mu}{4\pi r}\,\dot{\mathbf{p}}\left(t - \frac{r}{v}\right). \tag{8.23}$$


Let us analyze what exactly this result, valid in the limit $ka \ll 1$, describes. The second of
Eqs. (1) allows us to calculate the magnetic field by the spatial differentiation of A. At large distances
$r \gg \lambda$ (i.e. in the so-called far field zone), where Eq. (23) describes a virtually plane wave, the main
contribution to this derivative is given by the dipole moment factor:

Far zone field

$$\mathbf{B}(\mathbf{r},t) = \frac{\mu}{4\pi r}\,\nabla\times\dot{\mathbf{p}}\left(t - \frac{r}{v}\right) = -\frac{\mu}{4\pi r v}\,\mathbf{n}\times\ddot{\mathbf{p}}\left(t - \frac{r}{v}\right). \tag{8.24}$$

This expression means that the magnetic field, at the observation point, is perpendicular to vectors $\mathbf{n}$ and
(the retarded value of) $\ddot{\mathbf{p}}$, and its magnitude is

$$B = \frac{\mu}{4\pi r v}\left|\ddot{\mathbf{p}}\left(t - \frac{r}{v}\right)\right|\sin\Theta, \quad \text{i.e. } H = \frac{1}{4\pi r v}\left|\ddot{\mathbf{p}}\left(t - \frac{r}{v}\right)\right|\sin\Theta, \tag{8.25}$$

where $\Theta$ is the angle between those two vectors - see Fig. 2. 8


7 For relativistic particles, moving with velocities of the order of speed of light, one has to be more careful. As the 
result, I will postpone the discussion of their radiation until Chapter 10, i.e. until after the discussion of special 
relativity in Chapter 9. 

8 From the first of Eqs. (1), for the electric field, in the first approximation (23), we would get $-\partial\mathbf{A}/\partial t = -(1/4\pi\varepsilon v^2 r)\,\ddot{\mathbf{p}}(t - r/v) = -(Z/4\pi v r)\,\ddot{\mathbf{p}}(t - r/v)$.
The transverse component of this vector (see Fig. 2) is the proper wave field $\mathbf{E} = Z\mathbf{H}\times\mathbf{n}$, while its
longitudinal component is exactly compensated by $(-\nabla\phi)$ in the next term of expansion of Eq.
(17a) with respect to small parameter $r'/\lambda \ll 1$.

The most important feature of this result is that the time-dependent field decreases very slowly
(only as 1/r) with the distance from the source, so that the radial component of the corresponding
Poynting vector (7.7), $S_r = ZH^2$, drops as $1/r^2$, i.e. the full power $\mathscr{P}$ of the emitted spherical wave, which
scales as $r^2 S_r$, does not depend on the distance from the source - as it should for radiation. Equation (25)
allows us to be more quantitative; for the instantaneous radiation intensity we may plug it into Eq. (7.9)
to get

Instant power density

$$S_r = ZH^2 = \frac{Z}{(4\pi v r)^2}\left[\ddot{\mathbf{p}}\left(t - \frac{r}{v}\right)\right]^2\sin^2\Theta. \tag{8.26}$$

This is the famous formula for the electric dipole radiation; this is the dominating component of
radiation by a localized system of charges - unless $\ddot{\mathbf{p}} = 0$. Please notice its angular dependence: the
radiation vanishes at the axis of the retarded vector $\ddot{\mathbf{p}}$ (where $\Theta = 0$), and reaches its maximum in the
plane perpendicular to that axis. Integration of $S_r$ over all directions, i.e. over the whole sphere of radius
r, gives the total instant power of the dipole radiation: 9

Full instant power

$$\mathscr{P} = \oint S_r r^2\,d\Omega = \frac{Z}{(4\pi v)^2}\,\ddot{\mathbf{p}}^2\; 2\pi\int_0^\pi \sin^3\Theta\,d\Theta = \frac{Z}{6\pi v^2}\,\ddot{\mathbf{p}}^2. \tag{8.27}$$
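The angular integration leading to the full instant power can be verified by a direct numerical quadrature. This is just a sketch under stated assumptions: it takes the far-zone intensity in the form $S_r = Z[\ddot{p}\,\sin\Theta/4\pi v r]^2$, uses free-space values of Z and v, and an arbitrary sample value of $\ddot{p}$; all function names are illustrative. It also shows that the result does not depend on the radius r of the integration sphere.

```python
import math

Z0 = 376.730313668       # ohm; wave impedance of free space (assumed medium)
C = 299_792_458.0        # m/s


def s_r(theta, r, p_ddot, z=Z0, v=C):
    """Radial Poynting vector component of the dipole wave in the far zone."""
    return z * (p_ddot * math.sin(theta) / (4 * math.pi * v * r)) ** 2


def total_power_numeric(r, p_ddot, z=Z0, v=C, steps=2000):
    """Midpoint-rule integral of S_r over the sphere of radius r."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        # area element of the sphere: 2 pi r^2 sin(theta) d(theta)
        total += s_r(theta, r, p_ddot, z, v) * 2 * math.pi * r ** 2 * math.sin(theta) * d_theta
    return total


def total_power_formula(p_ddot, z=Z0, v=C):
    """Closed form: P = Z p_ddot^2 / (6 pi v^2)."""
    return z * p_ddot ** 2 / (6 * math.pi * v ** 2)


if __name__ == "__main__":
    p_ddot = 1.0e3       # sample second derivative of the dipole moment, C*m/s^2
    for radius in (1.0, 10.0, 1000.0):
        print(radius, total_power_numeric(radius, p_ddot), total_power_formula(p_ddot))
```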

In order to find the average power, this expression has to be averaged over a sufficiently long
time. In particular, if the source is monochromatic, $\mathbf{p}(t) = \mathrm{Re}[\mathbf{p}_\omega\exp\{-i\omega t\}]$, with time-independent
vector $\mathbf{p}_\omega$, such averaging may be carried out just over one period, giving an extra factor 2 in the
denominator:

Full average power

$$\overline{\mathscr{P}} = \frac{Z\omega^4}{12\pi v^2}\left|\mathbf{p}_\omega\right|^2. \tag{8.28}$$

The easiest example of application of the formula is to a point charge oscillating, with frequency
$\omega$, along a straight line (that we may take for axis z), with amplitude a. In this case, $\mathbf{p} = q\mathbf{n}_z z(t) = qa\,\mathbf{n}_z\mathrm{Re}[\exp\{-i\omega t\}]$,
and if the charge velocity amplitude, $a\omega$, is much less than the wave speed v, we may use
Eq. (28) with $p_\omega = qa$, giving




$$\overline{\mathscr{P}} = \frac{Zq^2a^2\omega^4}{12\pi v^2}. \tag{8.29}$$

9 In the Gaussian units, for free space (v = c), this important formula reads $\mathscr{P} = (2/3c^3)\,\ddot{\mathbf{p}}^2$. It was first derived
in 1897 by J. Larmor for the particular case of a single point charge q moving with acceleration $\ddot{\mathbf{r}}$, when $\ddot{\mathbf{p}} = q\ddot{\mathbf{r}}$
and hence $\mathscr{P} = (2q^2/3c^3)\,\ddot{\mathbf{r}}^2$. As a result, Eq. (27) is sometimes referred to as the Larmor formula.


Applied to an electron ($q = -e \approx -1.6\times 10^{-19}$ C), rotating about a nucleus at an atomic distance $a \sim 10^{-10}$ m,
the Larmor formula shows 10 that the energy loss due to the dipole radiation is so large that it would
cause the electron’s collapse on the atom’s nucleus in just $\sim 10^{-10}$ s. In the beginning of the 1900s, this classical
result was one of the main arguments for the development of quantum mechanics, which prevents such
collapse of electrons in their lowest-energy (ground) state.
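This order-of-magnitude estimate can be reproduced in a few lines. The sketch below is not the author's calculation; it rests on several assumptions of my own: a nonrelativistic circular orbit in the Coulomb field of a singly charged nucleus, orbit energy $E = -e^2/8\pi\varepsilon_0 r$, the instantaneous Larmor power $\mathscr{P} = e^2\ddot{r}^2/6\pi\varepsilon_0c^3$ with the centripetal acceleration $\ddot{r} = e^2/4\pi\varepsilon_0 m r^2$, and a slow (adiabatic) shrinkage of the orbit. Under these assumptions $dr/dt = -(4/3)\,r_c^2c/r^2$, where $r_c \equiv e^2/4\pi\varepsilon_0mc^2$, so that the collapse time from the initial radius $r_0$ is $t = r_0^3/4r_c^2c$.

```python
import math

E_CHARGE = 1.602176634e-19     # C
M_E = 9.1093837015e-31         # kg
EPS0 = 8.8541878128e-12        # F/m
C = 299_792_458.0              # m/s


def classical_radius(q=E_CHARGE, m=M_E):
    """r_c = q^2 / (4 pi eps_0 m c^2)."""
    return q ** 2 / (4 * math.pi * EPS0 * m * C ** 2)


def collapse_time(r0, q=E_CHARGE, m=M_E):
    """Time for a circular orbit of radius r0 to shrink to zero.

    Assumes dr/dt = -(4/3) r_c^2 c / r^2 (see the assumptions listed in the
    text above), which integrates to t = r0^3 / (4 r_c^2 c).
    """
    return r0 ** 3 / (4 * classical_radius(q, m) ** 2 * C)


if __name__ == "__main__":
    print("collapse time for r0 = 1e-10 m:", collapse_time(1e-10), "s")
```

For the initial radius of $10^{-10}$ m used in the text, this gives a time of the order of $10^{-10}$ s.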

Another example of a very useful application of Eq. (28) is the radio wave radiation by a short, 
straight, symmetric antenna which is fed, for example, by a TEM transmission line such as a coaxial 
cable - see Fig. 3. 



Fig. 8.3. Dipole antenna.


The exact solution of this problem is rather complex, because the law $I_\omega(z)$ of the current
variation along the antenna’s length should be calculated self-consistently with the distribution of the
electromagnetic field that is induced by the current in the surrounding space. (This fact is unfortunately
ignored in some textbooks.) However, one may argue that at $l \ll \lambda$, the current should be largest at the
feeding point (in Fig. 3, taken for z = 0), vanish at the antenna’s ends ($z = \pm l/2$), and that the only possible
scale of the current variation in the antenna is l itself, so that the linear function,

$$I_\omega(z) = I_\omega(0)\left(1 - \frac{2}{l}|z|\right), \tag{8.30}$$

gives a good approximation - as it indeed does. Now we can use the continuity equation $\partial Q/\partial t = I$, i.e.
$-i\omega Q_\omega = I_\omega$, to calculate the complex amplitude $Q_\omega(z) = iI_\omega(z)\,\mathrm{sgn}(z)/\omega$ of the electric charge $Q(z, t) = \mathrm{Re}[Q_\omega\exp\{-i\omega t\}]$
of the wire beyond point z, and from it, the amplitude of the linear density of charge

$$\lambda_\omega(z) \equiv \frac{dQ_\omega(z)}{d|z|} = -i\frac{2I_\omega(0)}{\omega l}\,\mathrm{sgn}\,z. \tag{8.31}$$

From here, the dipole moment’s amplitude is

$$p_\omega = 2\int_0^{l/2}\lambda_\omega(z)\,z\,dz = -i\frac{I_\omega(0)}{2\omega}\,l, \tag{8.32}$$


so that Eq. (28) yields

$$\overline{\mathscr{P}} = \frac{Z\omega^4}{12\pi v^2}\,\frac{|I_\omega(0)|^2}{4\omega^2}\,l^2 = \frac{Z(kl)^2}{24\pi}\,\frac{|I_\omega(0)|^2}{2}, \tag{8.33}$$

where $k \equiv \omega/v$. The analogy between this result and the dissipation power, $\overline{\mathscr{P}} = \mathrm{Re}Z\,|I_\omega|^2/2$, in a lumped
linear circuit element, allows the interpretation of the first fraction in the last form of Eq. (33) as the real
part of the antenna’s impedance:

$$\mathrm{Re}Z_A = Z\,\frac{(kl)^2}{24\pi}, \tag{8.34}$$

as felt by the transmission line. (Indeed, according to Eq. (7.118), the wave traveling along the line
toward the antenna is fully radiated, i.e. not reflected back, only if $Z_A$ equals $Z_W$ of the line.) As we
know from Chapter 7, for typical TEM lines, $Z_W \sim Z_0$, while Eq. (34), which is only valid in the limit $kl \ll 1$,
shows that for radiation into free space ($Z = Z_0$), $\mathrm{Re}Z_A$ is much less than $Z_0$.

10 Actually, the formula needs a numerical coefficient adjustment to account for the electron’s orbital (rather than
linear) motion - the task left for the reader’s exercise. However, this adjustment does not affect the order-of-magnitude
estimate given above.

Hence in order to reach the impedance matching condition $Z_W = Z_A$, the antenna’s length should be
increased - as a more involved theory shows, to $l \sim \lambda/2$. However, in many cases, practical
considerations make short antennas necessary. The most frequently met examples nowadays are the
cell phone antennas, which use frequencies close to 1 or 2 GHz, with free-space wavelengths $\lambda$ between
15 and 30 cm, i.e. much larger than the phone size. The quadratic dependence of the antenna’s efficiency on
l, following from Eq. (34), explains why every millimeter counts in the design of such antennas, and
why the designs are carefully optimized using software packages for (virtually exact) numerical solution
of time-dependent Maxwell equations for the specific shape of the antenna and other phone parts. 11

To conclude this section, let me note that if the wave source is not monochromatic, so that $\mathbf{p}(t)$
should be presented as a Fourier series,

$$\mathbf{p}(t) = \mathrm{Re}\sum_\omega \mathbf{p}_\omega e^{-i\omega t}, \tag{8.35}$$

the terms corresponding to interference of spectral components with different frequencies $\omega$ are
averaged out at the time averaging of the Poynting vector, so that the average radiated power is just a
sum of contributions (28) from all substantial frequency components.
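The short-antenna results of this section lend themselves to a quick numerical check. The sketch below assumes radiation into free space and the linear current distribution $I_\omega(z) = I_\omega(0)(1 - 2|z|/l)$, with the corresponding linear charge density $\lambda_\omega(z) = -i\,[2I_\omega(0)/\omega l]\,\mathrm{sgn}\,z$; the function names, the 1-cm length, and the 1-GHz frequency are illustrative choices. It integrates $\lambda_\omega(z)z$ numerically to get the dipole moment amplitude, and evaluates $\mathrm{Re}Z_A = Z(kl)^2/24\pi$.

```python
import math

Z0 = 376.730313668       # ohm; free space is assumed
C = 299_792_458.0        # m/s


def dipole_moment_amplitude(i0, length, omega, steps=10000):
    """Midpoint-rule integral of lambda_omega(z) * z over -l/2 < z < +l/2."""
    dz = length / steps
    total = 0j
    for i in range(steps):
        z = -length / 2 + (i + 0.5) * dz
        sign = 1.0 if z > 0 else -1.0
        charge_density = -1j * 2 * i0 / (omega * length) * sign
        total += charge_density * z * dz
    return total


def radiation_resistance(length, freq, z=Z0, v=C):
    """Re Z_A = Z (k l)^2 / (24 pi); meaningful only for k l << 1."""
    k = 2 * math.pi * freq / v
    return z * (k * length) ** 2 / (24 * math.pi)


if __name__ == "__main__":
    length, freq, i0 = 0.01, 1.0e9, 1.0      # 1 cm, 1 GHz, 1 A (sample values)
    omega = 2 * math.pi * freq
    print("p_omega (numeric):", dipole_moment_amplitude(i0, length, omega))
    print("p_omega (formula):", -1j * i0 * length / (2 * omega))
    print("Re Z_A, ohm:      ", radiation_resistance(length, freq))
```

For these sample values the radiation resistance comes out as a fraction of an ohm, i.e. indeed much less than $Z_0$, which illustrates the impedance matching problem discussed above.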


8.3. Wave scattering 

The formalism described above may be immediately used in the theory of scattering - the
phenomenon illustrated by Fig. 4. Generally, scattering is a complex problem. However, in many cases
it allows the so-called Born approximation, 12 in which the field of the scattered wave, acting on the scattering
object, is assumed to be much weaker than that of the incident wave, and is neglected.


11 A partial list of popular software packages of this kind includes both publicly available codes such as NEC-2
(whose various versions are available online, e.g., at http://alioth.debian.org/projects/necpp/ and
http://www.qsl.net/4nec2/), and proprietary packages - such as Momentum from Agilent Technologies (a
Hewlett-Packard spin-off), FEKO from EM Software & Systems, and XFdtd from Remcom.

12 Named after M. Born, one of the founding fathers of quantum mechanics. Note, however, that the basic idea of this
approach was developed much earlier (in 1881) by Lord Rayleigh - see below.

As the first example of this approach, let us consider scattering of a plane wave, propagating in
free space ($Z = Z_0$, $v = c$), by a free 13 charged particle whose motion may be described by nonrelativistic
classical mechanics. (This requires, in particular, the incident wave to be of a modest intensity, so that
the speed of the induced charge motion is much less than the speed of light.) In this case the magnetic
component of the Lorentz force (5.8),

$$\mathbf{F}_m = q\,\dot{\mathbf{r}}\times\mathbf{B}, \tag{8.36}$$

exerted on the charge by the magnetic field of a plane wave, is much smaller than force $\mathbf{F}_e = q\mathbf{E}$ exerted
by its electric field. Indeed, according to Eq. (7.8), $H = E/Z = E/(\mu/\varepsilon)^{1/2}$, $B = \mu H = E/v$, so that the ratio
$F_m/F_e$ equals the ratio of the particle’s speed, $|\dot{\mathbf{r}}|$, to the wave’s speed $v \approx c$.

Thus, assuming that the incident wave is linearly-polarized along axis x, the equation of
the particle’s motion in the Born approximation is just $m\ddot{x} = qE(t)$, so that for the x-component $p_x = qx$ of its
dipole moment we can write

$$\ddot{p} = q\ddot{x} = \frac{q^2}{m}E(t). \tag{8.37}$$

As we already know from Sec. 2, oscillations of the dipole moment lead to radiation of a wave with a
wide angular distribution of intensity; in our case this is the scattered wave - see Fig. 4. Its full power
may be found by plugging Eq. (37) into Eq. (27):

$$\mathscr{P} = \frac{Z_0}{6\pi c^2}\,\ddot{p}^2 = \frac{Z_0q^4}{6\pi c^2m^2}E^2(t), \quad \text{i.e. } \overline{\mathscr{P}} = \frac{Z_0q^4}{12\pi c^2m^2}|E_\omega|^2. \tag{8.38}$$

Since the power is proportional to the incident wave’s intensity S, it is customary to characterize the
scattering ability of the object by the ratio,

Full cross-section: definition

$$\sigma \equiv \frac{\overline{\mathscr{P}}}{\overline{S}_{incident}} = \frac{\overline{\mathscr{P}}}{|E_\omega|^2/2Z_0}, \tag{8.39}$$

which evidently has the dimension of area and is called the full cross-section of scattering. For this
measure, Eq. (38) yields the famous result


$$\sigma = \frac{Z_0^2q^4}{6\pi c^2m^2} = \frac{\mu_0^2q^4}{6\pi m^2}, \tag{8.40}$$

which is called the Thomson scattering formula, 14 especially when applied to an electron. This relation
is most frequently presented in the form 15

Thomson scattering formula

$$\sigma = \frac{8\pi}{3}r_c^2, \quad \text{with } r_c \equiv \frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{mc^2} = \frac{\mu_0q^2}{4\pi m}. \tag{8.41}$$

Constant $r_c$ is called the classical radius of the particle (or sometimes the “Thomson scattering length”);
for the electron ($q = -e$, $m = m_e$) it is close to $2.82\times 10^{-15}$ m. Its possible interpretation is evident from the
first form of Eq. (41) for $r_c$: at that distance between two similar particles, the potential energy $q^2/4\pi\varepsilon_0r$
of their electrostatic interaction is equal to the particle’s rest-mass energy $mc^2$. 16

13 As Eq. (7.30) shows, this calculation is also valid for an oscillator with eigenfrequency $\omega_0 \ll \omega$.

Now we have to go back and establish the conditions at which the Born approximation, when the
field of the scattered wave is negligible, is indeed valid for a point-object scattering. Since the scattered
wave’s intensity, described by Eq. (26), diverges as $1/r^2$, according to the definition (39) of the
cross-section, it may become comparable to the incident one at $r^2 \sim \sigma$. However, Eq. (38) itself is only valid if $r \gg \lambda$,
so that the Born approximation does not lead to any contradiction if

$$\sigma \ll \lambda^2. \tag{8.42}$$

For the Thomson scattering by an electron, this condition means $\lambda \gg r_c \sim 3\times 10^{-15}$ m and is fulfilled for
all frequencies up to very hard $\gamma$ rays with energies ~100 MeV.
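The numbers quoted for the electron may be reproduced with a short calculation. The sketch below uses standard SI values of the fundamental constants; the function names and the conversion of a photon energy into the wavelength, $\lambda = hc/E$, are my additions for illustration. It evaluates the classical radius $r_c = q^2/4\pi\varepsilon_0mc^2$, the cross-section $\sigma = (8\pi/3)r_c^2$, and the ratio $\sigma/\lambda^2$ that has to be small for the Born approximation to be self-consistent.

```python
import math

E_CHARGE = 1.602176634e-19     # C
M_E = 9.1093837015e-31         # kg
EPS0 = 8.8541878128e-12        # F/m
C = 299_792_458.0              # m/s
H_PLANCK = 6.62607015e-34      # J*s


def classical_radius(q=E_CHARGE, m=M_E):
    """r_c = q^2 / (4 pi eps_0 m c^2)."""
    return q ** 2 / (4 * math.pi * EPS0 * m * C ** 2)


def thomson_cross_section(q=E_CHARGE, m=M_E):
    """sigma = (8 pi / 3) r_c^2."""
    return 8 * math.pi / 3 * classical_radius(q, m) ** 2


def born_ratio(photon_energy_ev, q=E_CHARGE, m=M_E):
    """sigma / lambda^2 for a wave whose quantum has the given energy (in eV)."""
    wavelength = H_PLANCK * C / (photon_energy_ev * E_CHARGE)
    return thomson_cross_section(q, m) / wavelength ** 2


if __name__ == "__main__":
    print("r_c, m:      ", classical_radius())
    print("sigma, m^2:  ", thomson_cross_section())
    print("sigma/lambda^2 at 2 eV (visible light):", born_ratio(2.0))
    print("sigma/lambda^2 at 100 MeV:             ", born_ratio(1.0e8))
```

For visible light the ratio is negligibly small, while near 100 MeV it approaches the order of unity, in agreement with the estimate above.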

Possibly the most notable feature of result (40) is its independence of the wave frequency. As it
follows from its derivation, particularly from Eq. (37), this independence is intimately related with the
unbound character of charge motion. For bound charges, say for electrons in a gas molecule, this result
is only valid if the wave frequency $\omega$ is much higher than all eigenfrequencies $\omega_j$ of molecular resonances. In
the opposite limit, $\omega \ll \omega_j$, the result is dramatically different. Indeed, in this limit we can approximate
the molecule’s dipole moment by its static value (3.39)

$$\mathbf{p} = 4\pi\varepsilon_0\alpha_{mol}\mathbf{E}. \tag{8.43}$$

In the Born approximation, and in the absence of the molecular field effects discussed in Sec. 3.5, E in
this expression is just the incident wave’s field, and we can use Eq. (28) to calculate the power of the
wave scattered by a single molecule:

$$\overline{\mathscr{P}} = \frac{4\pi Z_0\varepsilon_0^2\omega^4}{3c^2}\,\alpha_{mol}^2\,|E_\omega|^2. \tag{8.44}$$

Now, using the last form of definition (39) of the cross-section, we get a very simple result,

$$\sigma = \frac{8\pi Z_0^2\varepsilon_0^2\omega^4}{3c^2}\,\alpha_{mol}^2 = \frac{8\pi k^4}{3}\,\alpha_{mol}^2, \tag{8.45}$$

showing that in contrast to Eq. (40), at low frequencies $\sigma$ grows as fast as $\omega^4$.

14 Named after Sir J. J. Thomson (1856-1940), the discoverer of the electron - and isotopes as well! He is not to
be confused with his son, G. P. Thomson, who discovered (simultaneously with C. Davisson and L. Germer)
quantum-mechanical wave properties of the same electron.

15 In the Gaussian units, this formula looks like $r_c = q^2/mc^2$ (giving, of course, the same numerical value: for the
electron, $r_c \approx 2.82\times 10^{-13}$ cm). This classical quantity should not be confused with the particle’s Compton wavelength
$\lambda_C \equiv h/mc$ (for the electron, close to $2.43\times 10^{-10}$ cm), which naturally arises in quantum electrodynamics - see a
brief discussion in the next chapter, and QM Chapter 9 for more detail.

16 It is fascinating how smartly the relativistic expression $mc^2$ has sneaked into the result (40), which was obtained
using a nonrelativistic equation of particle motion. This was possible because the calculation engaged
electromagnetic waves that propagate with the speed of light, and whose quanta (photons), as a result, may be
frequently treated as relativistic (moreover, ultrarelativistic) particles - see the next chapter.

Now let us explore the effect of such Rayleigh scattering 17 on wave propagation in a gas, with
relatively low density n. We can expect (and will prove in the next section) that due to the randomness
of molecule positions, the waves scattered by individual molecules may be treated as incoherent, so that the
total scattering power may be calculated just as the sum of those scattered by each molecule. We can use
this additivity to write the balance of the incident wave’s intensity on a small volume dV of length
(along the incident wave direction) dz, and area A across it. Since such a segment includes $ndV = nAdz$
molecules, and, according to definition (39), each of them scatters power $S\sigma = \mathscr{P}\sigma/A$, the total
scattered power is $n\mathscr{P}\sigma dz$; hence the incident power’s change is

$$d\mathscr{P} = -n\sigma\mathscr{P}dz. \tag{8.46}$$

Comparing this equation with the general definition (7.202) of the attenuation constant, we see that
scattering gives the following contribution to attenuation: $\alpha = n\sigma$. From here, using Eq. (3.41) to write
$\alpha_{mol} = (\varepsilon_r - 1)/4\pi n$, and Eq. (45), we get

Rayleigh scattering formula

$$\alpha = \frac{k^4}{6\pi n}(\varepsilon_r - 1)^2. \tag{8.47}$$

This is the famous Rayleigh scattering formula, which in particular explains the colors of blue
sky and red sunsets. Indeed, through the visible light spectrum, $\omega$ changes almost two-fold; as a result,
scattering of blue components of sunlight is an order of magnitude higher than that of its red
components. More quantitatively, for air near the Earth surface, $\varepsilon_r - 1 \approx 6\times 10^{-4}$, and $n \sim 2.5\times 10^{25}$ m$^{-3}$ - see
Sec. 3.3. Plugging these numbers into Eq. (47), we see that the characteristic length $L \equiv 1/\alpha$ of
scattering is ~30 km for blue light and ~200 km for red light. 18 The Earth atmosphere is thinner ($h \sim 10$
km), so that the Sun looks just a bit yellowish during most of the day. However, elementary geometry
shows that at sunset, the light should pass length $l \sim (R_Eh)^{1/2} \approx 300$ km to reach an Earth-surface
observer; as a result, the blue components of Sun’s light spectrum are almost completely scattered out,
and even the red components are weakened considerably.
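These estimates are simple to reproduce. The sketch below evaluates the Rayleigh attenuation constant $\alpha = k^4(\varepsilon_r - 1)^2/6\pi n$ for the air parameters quoted above. The specific wavelengths taken for "blue" (450 nm) and "red" (700 nm) light, the Earth radius value, the uniform-density atmosphere, and all function names are my assumptions for illustration only.

```python
import math


def rayleigh_attenuation(wavelength, eps_r_minus_1, n):
    """alpha = k^4 (eps_r - 1)^2 / (6 pi n), in 1/m, with k = 2 pi / wavelength."""
    k = 2 * math.pi / wavelength
    return k ** 4 * eps_r_minus_1 ** 2 / (6 * math.pi * n)


def scattering_length(wavelength, eps_r_minus_1=6e-4, n=2.5e25):
    """Characteristic length L = 1 / alpha, in meters (air near the surface)."""
    return 1.0 / rayleigh_attenuation(wavelength, eps_r_minus_1, n)


def transmitted_fraction(wavelength, path):
    """exp(-path / L): the fraction of power that is not scattered out on the path."""
    return math.exp(-path / scattering_length(wavelength))


if __name__ == "__main__":
    blue, red = 450e-9, 700e-9                    # assumed sample wavelengths, m
    sunset_path = math.sqrt(6.371e6 * 1.0e4)      # (R_E h)^(1/2), with h = 10 km
    for name, wl in (("blue", blue), ("red", red)):
        print(name, "L, km:", scattering_length(wl) / 1e3,
              "sunset transmission:", transmitted_fraction(wl, sunset_path))
```

With these inputs, the scattering length is a few tens of kilometers for blue light and about two hundred kilometers for red light, so that on the sunset path nearly all blue light, and a considerable part of the red light, is scattered out.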


To conclude the discussion of Eq. (47), let me note that its comparison with the condition of the
direct applicability of the Born approximation for a distributed object of size a,

$$\alpha a \ll 1, \tag{8.48}$$

implies, in particular, that if the electric polarizability of the material is small, $\varepsilon_r \to 1$, we may be able to
use the approximation for an analysis of scattering by even relatively large objects, with size of the order
of, or even larger than, $\lambda$. However, for such extended objects, the phase difference factors (neglected
above) step in, leading in particular to the important effects of interference and diffraction, to whose
discussion we now proceed.

17 Named after Lord Rayleigh (born J. Strutt, 1842-1919), whose numerous contributions to science include the
discovery of argon. He also pioneered (for the special case we are considering now) the basic idea of what is
presently called the Born approximation.

18 These values are approximate because both n and $(\varepsilon_r - 1)$ vary through the atmosphere.


8.4. Interference and diffraction


These effects show up not as much in the total power of scattered radiation, as in its angular
distribution. It is traditional to characterize this distribution by the differential cross-section defined as

Differential cross-section: definition

$$\frac{d\sigma}{d\Omega} \equiv \frac{\overline{S}_r\,r^2}{\overline{S}_{incident}}, \tag{8.49}$$

where r is the distance from the scatterer, at which the scattered wave is observed. Both the definition
and notation become clearer if we notice that according to Eq. (26), at large distances ($r \gg a$), the
numerator in the right-hand part of Eq. (49), and hence the differential cross-section as a whole, does
not depend on r, and that its integral over the total solid angle $\Omega = 4\pi$ coincides with the total
cross-section defined by Eq. (39):

$$\oint_{4\pi}\frac{d\sigma}{d\Omega}\,d\Omega = \frac{r^2}{\overline{S}_{incident}}\oint_{4\pi}\overline{S}_r\,d\Omega = \frac{1}{\overline{S}_{incident}}\oint_{r=\mathrm{const}}\overline{S}_r\,d^2r = \frac{\overline{\mathscr{P}}}{\overline{S}_{incident}} \equiv \sigma. \tag{8.50}$$

For example, according to Eq. (26), the angular distribution of radiation scattered by a point
linear dipole, in the Born approximation, is rather broad; in particular, in the low-frequency limit (43),

$$\frac{d\sigma}{d\Omega} = \alpha_{mol}^2k^4\sin^2\Theta. \tag{8.51}$$


If the wave is scattered by a small dielectric body, with a characteristic size $a \ll \lambda$ (i.e., $ka \ll 1$), then
all its parts re-radiate the incident wave coherently. Hence, we can calculate it in a similar way, just
replacing the molecular dipole moment (43) with the total dipole moment of the object - see Eq. (3.37):

$$\mathbf{p} = \mathbf{P}V = (\varepsilon_r - 1)\varepsilon_0\mathbf{E}V, \tag{8.52}$$

where $V \sim a^3$ is the body’s volume. As a result, the differential cross-section may be obtained from Eq. (51)
with the replacement $\alpha_{mol} \to V(\varepsilon_r - 1)/4\pi$:

$$\frac{d\sigma}{d\Omega} = \left(\frac{k^2V}{4\pi}\right)^2(\varepsilon_r - 1)^2\sin^2\Theta, \tag{8.53}$$

i.e. follows the same $\sin^2\Theta$ law. The situation for extended objects, with at least one dimension of the
order of, or larger than, the wavelength, is different: here we have to take into account that the phase shifts
introduced by various parts of the body are different. Let us analyze this issue for an arbitrary collection
of similar point scatterers located at points $\mathbf{r}_j$.

If the wave vector of the incident plane wave is $\mathbf{k}_0$, the field of the wave has the phase factor
$\exp\{i\mathbf{k}_0\cdot\mathbf{r}\}$ - see Eq. (7.79). At the location of the j-th scattering center, the factor equals $\exp\{i\mathbf{k}_0\cdot\mathbf{r}_j\}$, so
that the local polarization vector $\mathbf{p}_j$, and the scattered wave it creates, are proportional to this factor. On
its way to the observation point $\mathbf{r}$, the scattered wave, with wave vector $\mathbf{k}$ (with $k = k_0$), acquires an
additional phase factor $\exp\{i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}_j)\}$, so that the scattered wave field is proportional to

$$\exp\{i\mathbf{k}_0\cdot\mathbf{r}_j + i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}_j)\} = \exp\{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}_j + i\mathbf{k}\cdot\mathbf{r}\} = e^{i\mathbf{k}\cdot\mathbf{r}}\exp\{-i(\mathbf{k} - \mathbf{k}_0)\cdot\mathbf{r}_j\}. \tag{8.54}$$

Since the first factor in the last expression does not depend on $\mathbf{r}_j$, in order to calculate the total scattered
wave, it is sufficient to sum up the elementary phase factors $\exp\{-i\mathbf{q}\cdot\mathbf{r}_j\}$, where vector

$$\mathbf{q} \equiv \mathbf{k} - \mathbf{k}_0 \tag{8.55}$$

has the physical sense of the wave vector change at scattering. 19 It may look like the phase factor
depends on the choice of origin. However, according to Eq. (7.42), the average intensity of the scattered
wave is proportional to $E_\omega E_\omega^*$, i.e. to the following real scalar function of vector $\mathbf{q}$:

Scattering function

$$F(\mathbf{q}) = \left(\sum_j\exp\{-i\mathbf{q}\cdot\mathbf{r}_j\}\right)\left(\sum_{j'}\exp\{-i\mathbf{q}\cdot\mathbf{r}_{j'}\}\right)^* = \sum_{j,j'}\exp\{i\mathbf{q}\cdot(\mathbf{r}_j - \mathbf{r}_{j'})\} = |I(\mathbf{q})|^2, \tag{8.56}$$

where the complex function

Phase sum

$$I(\mathbf{q}) \equiv \sum_j\exp\{-i\mathbf{q}\cdot\mathbf{r}_j\}, \tag{8.57}$$

called the phase sum, may be calculated within any reference frame, without affecting the final result
(56). The double-sum form of Eq. (56) makes it easy to notice that for a system of many ($N \gg 1$)
similar but randomly located scatterers, only the terms with $j = j'$ accumulate at summation, so that $F(\mathbf{q})$
scales as N, rather than $N^2$ - thus justifying the above treatment of the Rayleigh scattering problem.
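Both statements of the last paragraph, the independence of the result of the choice of the origin and the scaling of the scattering function as N for randomly located scatterers, may be illustrated numerically. The following sketch is only an illustration: the function names, the cubic box of random positions, the sample vector q, and the fixed random seed are assumptions of mine. It computes the phase sum $I(\mathbf{q}) = \sum_j\exp\{-i\mathbf{q}\cdot\mathbf{r}_j\}$ and $F(\mathbf{q}) = |I(\mathbf{q})|^2$.

```python
import cmath
import random


def phase_sum(q, positions):
    """I(q) = sum over j of exp{-i q . r_j}; q and each r_j are 3-tuples."""
    return sum(cmath.exp(-1j * sum(qc * rc for qc, rc in zip(q, r)))
               for r in positions)


def scattering_function(q, positions):
    """F(q) = |I(q)|^2."""
    return abs(phase_sum(q, positions)) ** 2


def mean_f_random(n_scatterers, trials, box=1000.0, q=(1.2345, 0.0, 0.0), seed=0):
    """Average of F(q) over `trials` random placements inside a cubic box.

    The box is taken much larger than 1/|q|, so the phases are effectively random.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        positions = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
                     for _ in range(n_scatterers)]
        total += scattering_function(q, positions)
    return total / trials


if __name__ == "__main__":
    q = (3.0, -1.0, 2.0)
    points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-2.0, 0.5, 1.0)]
    shifted = [(x + 5.0, y - 7.0, z + 1.0) for x, y, z in points]
    print("F before / after shifting the origin:",
          scattering_function(q, points), scattering_function(q, shifted))
    n = 200
    print("mean F / N for random scatterers:", mean_f_random(n, 500) / n)
```

The ratio of the mean F to N should come out close to 1 (rather than to N), while for any single random placement F fluctuates strongly around this mean.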

Let us start using Eq. (56) by applying it to the simplest problem of just two similar small
scatterers, separated by a fixed distance a:

$$F(\mathbf{q}) = \sum_{j,j'=1}^{2}\exp\{i\mathbf{q}\cdot(\mathbf{r}_j - \mathbf{r}_{j'})\} = 2 + \exp\{-iq_aa\} + \exp\{iq_aa\} = 2(1 + \cos q_aa) = 4\cos^2\frac{q_aa}{2}, \tag{8.58}$$

where $q_a \equiv \mathbf{q}\cdot\mathbf{a}/a$ is the component of vector $\mathbf{q}$ along vector $\mathbf{a}$ connecting the scatterers. The apparent
simplicity of this result may be a bit misleading, because the mutual plane of vectors $\mathbf{k}$ and $\mathbf{k}_0$ (and
hence of vector $\mathbf{q}$) does not necessarily coincide with the mutual plane of vectors $\mathbf{k}_0$ and $\mathbf{E}_\omega$, so that the
scattering angle $\alpha$ between vectors $\mathbf{k}$ and $\mathbf{k}_0$ is generally different from $(\pi/2 - \Theta)$ - see Fig. 5.



Fig. 8.5. Angles important for the general 
scattering problem. 


19 In quantum electrodynamics, $\hbar\mathbf{q}$ has the sense of the momentum transferred from the scattering object to the
scattered photon, and this terminology sometimes creeps even into the classical electrodynamics texts.

Moreover, vectors $\mathbf{q}$ and $\mathbf{a}$ may have another common plane, and the angle between them is one
more parameter that may be considered as independent from both $\alpha$ and $\Theta$. As a result, the angular
dependence of the scattered wave’s intensity (and hence $d\sigma/d\Omega$), which depends on all three angles, may
be rather complex.

This is why let me consider only the simple case when vectors $\mathbf{k}$, $\mathbf{k}_0$, and $\mathbf{a}$ are all in the same
plane (Fig. 6a), with $\mathbf{k}_0$ perpendicular to $\mathbf{a}$ (leaving the general analysis for the reader’s exercise). Then,
with our choice of coordinates, $q_a = q_x = k\sin\alpha$, and Eq. (58) is reduced to

$$F(\mathbf{q}) = 4\cos^2\frac{ka\sin\alpha}{2}. \tag{8.59}$$

This function always has two maxima, at $\alpha = 0$ and $\alpha = \pi$, and possibly (if the product ka is large
enough) other maxima at special angles $\alpha_n$ that satisfy the famous Bragg condition 20

Bragg condition

$$ka\sin\alpha_n = 2\pi n, \quad \text{i.e. } a\sin\alpha_n = n\lambda. \tag{8.60}$$



Fig. 8.6. The simplest geometries for (a) interference and (b) diffraction.


As evident from Fig. 6a, this condition may be readily understood as the in-phase addition
(frequently called the constructive interference) of two coherent waves scattered by the two points,
when the difference between their paths toward the observer, $a\sin\alpha$, equals an integer number of
wavelengths. At each such maximum, F = 4, due to the doubling of the wave amplitude and hence
quadrupling of its power.
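As a simple illustration of the Bragg condition, the sketch below lists the angles $\alpha_n$ at which $a\sin\alpha_n = n\lambda$, for a given distance a between the two scatterers, and evaluates the two-point scattering function $F = 4\cos^2[(ka\sin\alpha)/2]$ at them. The function names and the sample ratio $a/\lambda = 3.5$ are illustrative assumptions, and only the angles in the interval from 0 to $\pi/2$ are listed.

```python
import math


def two_point_f(ka, alpha):
    """F = 4 cos^2(ka sin(alpha) / 2) for two similar point scatterers."""
    return 4 * math.cos(ka * math.sin(alpha) / 2) ** 2


def bragg_angles(a, wavelength):
    """Angles alpha_n in [0, pi/2] with a sin(alpha_n) = n * wavelength."""
    n_max = math.floor(a / wavelength + 1e-12)   # tolerance guards against rounding
    return [math.asin(min(1.0, n * wavelength / a)) for n in range(n_max + 1)]


if __name__ == "__main__":
    a, wavelength = 3.5, 1.0                     # sample values, arbitrary units
    ka = 2 * math.pi * a / wavelength
    for n, alpha in enumerate(bragg_angles(a, wavelength)):
        print(n, math.degrees(alpha), two_point_f(ka, alpha))
```

At each listed angle F reaches its maximum value 4, while halfway between the maxima (for example, at $a\sin\alpha = \lambda/2$) the two scattered waves cancel and F vanishes.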

If the distance between the point scatterers is large ($ka \gg 1$), the first Bragg maxima correspond
to small angles, $\alpha \ll 1$. For this region, Eq. (59) is reduced to a simple sinusoidal dependence of
function F on angle $\alpha$. Moreover, within the range of small $\alpha$, the polarization factor $\sin^2\Theta$ is virtually
constant, so that the scattered wave intensity, and hence the differential cross-section

Young’s interference pattern

$$\frac{d\sigma}{d\Omega} \propto F(\mathbf{q}) = 4\cos^2\frac{ka\alpha}{2}. \tag{8.61}$$


This is of course the well-known interference pattern of Young’s two-slit
experiment. 21 (As will be discussed in the next section, the theoretical description of the two-slit experiment
is more complex than that of the Born scattering, but the experiment is preferable in practice, because at scattering,
the wave of intensity (61) has to be observed on the backdrop of a stronger incident wave that
propagates in almost the same direction, $\alpha = 0$.)

20 Named after Sir William Bragg and his son, Sir William Lawrence Bragg, who in 1912 demonstrated X-ray
diffraction by atoms in crystals. The Braggs’ experiments have made the existence of atoms (before that, a
hypothetical notion ignored by many physicists) indisputable.

The Bragg condition (60) does not change at scattering from $N > 2$ similar, equidistant scatterers located along the same straight line (because the condition is applicable to each pair of adjacent scatterers), but the interference pattern changes. Leaving the analysis of the case of arbitrary $N$ for the reader's exercise, let me jump to the limit $N \to \infty$, in which we may ignore the scatterer discreteness. The resulting pattern is similar to that at scattering by a continuous thin rod, so let us first discuss the Born scattering by an arbitrary distributed object - say, an extended dielectric body with a constant value of $\varepsilon_r$. Transferring Eq. (56) from the sum to an integral, for the differential cross-section we get

$$\frac{d\sigma}{d\Omega} = \frac{k^4}{(4\pi)^2}(\varepsilon_r - 1)^2 F(\mathbf{q})\sin^2\theta = \frac{k^4}{(4\pi)^2}(\varepsilon_r - 1)^2 |I(\mathbf{q})|^2\sin^2\theta, \tag{8.62}$$

where $I(\mathbf{q})$ now becomes the phase integral,22

$$I(\mathbf{q}) \equiv \int_V \exp\{-i\mathbf{q}\cdot\mathbf{r}'\}\,d^3r', \tag{8.63}$$

with the dimensionality of volume.


Now we may return to the particular case of a thin rod (with both dimensions of the cross-section's area much smaller than $\lambda$, but an arbitrary length $a$), otherwise keeping the same simple geometry as for two point scatterers - see Fig. 6b. In this case the phase integral is just

$$I(\mathbf{q}) = A\int_{-a/2}^{+a/2}\exp\{-iq_x x'\}\,dx' = A\,\frac{\exp\{-iq_x a/2\} - \exp\{iq_x a/2\}}{-iq_x} \equiv V\,\frac{\sin\xi}{\xi}, \tag{8.64}$$

where $V = Aa$ is the volume of the rod, and $\xi$ is a dimensionless parameter defined as

$$\xi \equiv \frac{q_x a}{2} = \frac{ka\sin\alpha}{2}. \tag{8.65}$$


The fraction participating in Eq. (64) is met in physics so frequently that it has deserved the special name, the sinc (not “sync”, please!) function:

$$\operatorname{sinc}\xi \equiv \frac{\sin\xi}{\xi}. \tag{8.66}$$

Obviously, this function, plotted in Fig. 7, vanishes at all points $\xi_n = \pi n$ with integer $n$, besides the point $n = 0$: $\operatorname{sinc}\xi_0 = \operatorname{sinc} 0 = 1$.
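The result (64) and the zeros of the sinc function may be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the midpoint integration rule, and the sample values of $q_x$ and $a$ are mine, and the rod's cross-section area $A$ is set to 1.

```python
import cmath
import math


def sinc(xi):
    """sinc(xi) = sin(xi)/xi, with the limiting value 1 at xi = 0."""
    return 1.0 if abs(xi) < 1e-12 else math.sin(xi) / xi


def rod_phase_integral(q_x, a, area=1.0, steps=2000):
    """Midpoint-rule estimate of A * integral of exp(-i q_x x') over [-a/2, +a/2]."""
    h = a / steps
    total = sum(cmath.exp(-1j * q_x * (-a / 2 + (j + 0.5) * h)) for j in range(steps))
    return area * total * h


if __name__ == "__main__":
    q_x, a = 3.0, 2.0                      # hypothetical values, so xi = q_x a / 2 = 3
    print(rod_phase_integral(q_x, a))      # numerical phase integral
    print(a * sinc(q_x * a / 2))           # V sinc(xi) with V = A a = a
    print([sinc(math.pi * n) for n in (1, 2, 3)])  # zeros at xi = pi n
```

The two printed values of the integral should agree to many digits, and the sinc values at $\xi = \pi n$ should vanish within rounding errors.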


21 This experiment was described as early as 1803 by T. Young - one more universal genius of science, who also introduced the Young modulus in elasticity theory (see, e.g., CM Chapter 7), besides numerous other achievements - including deciphering Egyptian hieroglyphs! The two-slit experiment firmly established the wave picture of light, to be replaced by the dualistic photon-vs-wave picture, formalized by quantum electrodynamics, only 100+ years later.

22 Since the observation point's position $\mathbf{r}$ does not participate in this formula explicitly, the prime sign in $\mathbf{r}'$ could be dropped, but I keep it as a reminder that the integral is taken over points $\mathbf{r}'$ of the scattering object.

The function $F(\mathbf{q}) = V^2\operatorname{sinc}^2\xi$ resulting from Eq. (64) is plotted with the red line in Fig. 8, and is called the Fraunhofer diffraction pattern.


Fig. 8.8. The Fraunhofer diffraction pattern $F(\mathbf{q})/V^2$ (solid red line) and its envelope $1/\xi^2$ (dashed line). For comparison, the blue line shows the standard interference pattern $\cos^2\xi$ - cf. Eq. (59).


Note that it oscillates with the same period, $\Delta(\sin\alpha) = 2\pi/ka \ll 1$, as the interference pattern (59) from two point scatterers (shown with the blue line in Fig. 8). However, at the interference, the scattered wave intensity vanishes at angles $\alpha'_n$ that satisfy the condition

$$\frac{ka\sin\alpha'_n}{2\pi} = n + \frac{1}{2}, \tag{8.67}$$

when the optical path difference $a\sin\alpha$ equals a half-integer number of wavelengths, i.e. an odd multiple of $\lambda/2 = \pi/k$, and hence the two waves from the scatterers arrive at the observer in anti-phase (the so-called destructive interference). On the other hand, for the diffraction from a continuous rod the minima occur at a different set of angles,

$$\frac{ka\sin\alpha_n}{2\pi} = n, \tag{8.68}$$

i.e. exactly where the two-point interference pattern has its maxima. The reason for this relation is that the wave diffraction on the rod may be considered as a simultaneous interference of waves from all its fragments, and exactly at the observation angles when the rod edges give waves with phases shifted by $2\pi n$, the interior points of the rod give waves with all possible phases, with their algebraic sum equal to zero. Even more visibly in Fig. 8, at diffraction the intensity oscillations are limited by a rapidly decreasing envelope function $1/\xi^2$. The reason for this fast decrease is that with each Fraunhofer diffraction period, a smaller and smaller fraction of the rod gives an unbalanced contribution to the scattered wave.

If the rod's length is small ($ka \ll 1$, i.e. $a \ll \lambda$), then the sinc's argument $\xi$ is small at all scattering angles $\alpha$, so $I(\mathbf{q}) \approx V$, and Eq. (64) is reduced to Eq. (53). In the opposite limit, $a \gg \lambda$, the first zeros of function $I(\mathbf{q})$ correspond to very small angles $\alpha$, for which $\sin\theta \approx 1$, so that the differential cross-section is

$$\frac{d\sigma}{d\Omega} = \frac{k^4V^2}{(4\pi)^2}(\varepsilon_r - 1)^2\operatorname{sinc}^2\frac{ka\alpha}{2}, \tag{8.69}$$


i.e. Fig. 8 shows the scattering intensity as a function of the diffraction direction - if the pattern is 
observed within the plane containing the rod. 


8.5. The Huygens principle 

The Born approximation allows tracing the basic features of (and the difference between) the phenomena of interference and diffraction. Unfortunately, this approximation, based on the relative weakness of the scattered wave, cannot be used for more popular experimental implementations of these phenomena, for example, Young's two-slit experiment, or diffraction on a single slit or orifice - see, e.g., Fig. 9. Indeed, in such experiments, the orifice size $a$ is typically much larger than the light's wavelength, and as a result, no clear decomposition of the fields into the incident and “scattered” waves is possible.



Fig. 8.9. Typical geometry 
for the Huygens principle 
application. 


However, for such experiments, another approximation, called the Huygens (or “Huygens-Fresnel”) principle,23 is very instrumental: the passed wave may be presented as a linear superposition of spherical waves of the type (17), as if they were emitted by every point of the orifice (or, more physically, by every point of the incident wave's front that has arrived at the orifice). This approximation is valid if the following strong conditions are satisfied:

$$\lambda \ll a \ll r, \tag{8.70}$$

where $r$ is the distance of the observation point from the orifice. In addition, as we have seen in the last section, at small $\lambda/a$ the diffraction phenomena are confined to angles $\alpha \sim 1/ka \sim \lambda/a \ll 1$. For observation at such small angles, the mathematical expression of the Huygens principle, for the complex amplitude $f_\omega(\mathbf{r})$ of a monochromatic wave $f(\mathbf{r}, t) = \mathrm{Re}[f_\omega e^{-i\omega t}]$, is given by the following simple formula:

$$f_\omega(\mathbf{r}) = C\int_{\text{orifice}} f_\omega(\mathbf{r}')\,\frac{e^{ikR}}{R}\,d^2r'. \tag{8.71}$$

Here $f$ is any transverse component of any of the wave's fields (either $\mathbf{E}$ or $\mathbf{H}$),24 $R$ is the distance between point $\mathbf{r}'$ at the orifice and the observation point $\mathbf{r}$ (i.e. the magnitude of vector $\mathbf{R} \equiv \mathbf{r} - \mathbf{r}'$), and $C$ is a complex constant.

23 Named after C. Huygens (1629-1695), who conjectured the wave theory of light (which remained controversial for more than a century, until T. Young's experiments), and A.-J. Fresnel (1788-1827), who developed the mathematical theory of diffraction.

Before describing the proof of Eq. (71), let me carry out its sanity check - which will also give us the constant $C$. Let us see what happens if the field under the integral is the usual plane wave $f_\omega(z)$ propagating along axis $z$ (i.e. there is no opaque screen at all), so we should take the whole $[x, y]$ plane, say with $z' = 0$, as the integration area (Fig. 10).



Fig. 8.10. The Huygens 
principle applied to a plane 
wave. 


Then, for the observation point with coordinates $x = 0$, $y = 0$, and $z \gg \lambda$, Eq. (71) yields

$$f_\omega(z) = Cf_\omega(0)\int dx'\int dy'\,\frac{\exp\{ik(x'^2 + y'^2 + z^2)^{1/2}\}}{(x'^2 + y'^2 + z^2)^{1/2}}. \tag{8.72}$$

Before specifying the integration limits, let us consider the range $|x'|, |y'| \ll z$. In this range the square root, met in Eq. (72) twice, may be approximated as

$$(x'^2 + y'^2 + z^2)^{1/2} \equiv z\left(1 + \frac{x'^2 + y'^2}{z^2}\right)^{1/2} \approx z\left(1 + \frac{x'^2 + y'^2}{2z^2}\right) \equiv z + \frac{x'^2 + y'^2}{2z}. \tag{8.73}$$


24 The fact that the Huygens principle is valid for any field component should not be too surprising. Due to the condition $a \gg \lambda$, the real boundary conditions at the orifice edges are not important; the only important point is that the screen, which limits the orifice, is opaque. Because of this, the Huygens principle's expression (71) is a part of the so-called scalar theory of diffraction. (In this course I will not have time to go beyond this approximation.)

The denominator of Eq. (72) is a much slower function of $x'$ and $y'$ than the exponent, and in it (as we will check a posteriori), it is sufficient to keep just the main, first term of expansion (73). With that, Eq. (72) becomes

$$f_\omega(z) = Cf_\omega(0)\,\frac{e^{ikz}}{z}\int dx'\int dy'\,\exp\left\{\frac{ik(x'^2 + y'^2)}{2z}\right\} \equiv Cf_\omega(0)\,\frac{e^{ikz}}{z}\,I_xI_y, \tag{8.74}$$


where $I_x$ and $I_y$ are two similar integrals; for example,

$$I_x = \int\exp\left\{\frac{ikx'^2}{2z}\right\}dx' = \left(\frac{2z}{k}\right)^{1/2}\int\exp\{i\xi^2\}\,d\xi \equiv \left(\frac{2z}{k}\right)^{1/2}\left[\int\cos(\xi^2)\,d\xi + i\int\sin(\xi^2)\,d\xi\right], \tag{8.75}$$

where $\xi \equiv (k/2z)^{1/2}x'$. These are the so-called Fresnel integrals. I will discuss them in more detail in the next section, and right now, only one property of these integrals is important for us: if taken in symmetric limits $[-\xi_0, +\xi_0]$, both of them rapidly converge to the same value, $(\pi/2)^{1/2}$, as soon as $\xi_0$ becomes much larger than 1.25 This means that even if we do not impose any exact limits on the integration area in Eq. (72), this integral converges to the value


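$$f_\omega(z) = Cf_\omega(0)\,\frac{e^{ikz}}{z}\,\frac{2z}{k}\left[\left(\frac{\pi}{2}\right)^{1/2} + i\left(\frac{\pi}{2}\right)^{1/2}\right]^2 = C\,\frac{2\pi i}{k}\,f_\omega(0)\,e^{ikz}, \tag{8.76}$$

due to contributions from the central area with linear size of the order of $\Delta\xi \sim 1$, i.e.

$$\Delta x \sim \Delta y \sim \left(\frac{z}{k}\right)^{1/2} \sim (\lambda z)^{1/2}, \tag{8.77}$$

so that the contribution by front points $\mathbf{r}'$ well beyond the range (77) is negligible.26 (Within our assumptions (70), which in particular require $\lambda$ to be much less than $z$, the diffraction angle $\Delta x/z \sim \Delta y/z \sim (\lambda/z)^{1/2}$, corresponding to the important area of the front, is small.) In order to sustain the plane wave propagation, $f_\omega(z) = f_\omega(0)e^{ikz}$, constant $C$ in Eq. (76) has to be taken equal to $k/2\pi i$. Thus, the Huygens principle's prediction (71), in its final form, reads

$$f_\omega(\mathbf{r}) = \frac{k}{2\pi i}\int_{\text{orifice}} f_\omega(\mathbf{r}')\,\frac{e^{ikR}}{R}\,d^2r', \tag{8.78}$$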


and describes, in particular, the straight propagation of the plane wave (in a uniform medium).
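The plane-wave sanity check above lends itself to a simple numerical test. With $C = k/2\pi i$ and $I_x = I_y = (2z/k)^{1/2}J$, where $J$ is the integral of $\exp\{i\xi^2\}$ over $[-\xi_0, +\xi_0]$, Eq. (74) gives $f_\omega(z)/f_\omega(0)e^{ikz} = J^2/i\pi$, which should approach 1 as the cutoff $\xi_0$ grows. The sketch below assumes a plain Simpson-rule integrator; the function names and the cutoff values are illustrative choices of mine.

```python
import cmath
import math


def simpson(func, lo, hi, n):
    """Composite Simpson rule with n (even) sub-intervals; works for complex values."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def plane_wave_ratio(xi0, n=60000):
    """Ratio f(z) / [f(0) exp(ikz)] from Eq. (74) with C = k/(2 pi i),
    with each Fresnel-type integral cut off at |xi| = xi0."""
    j = simpson(lambda xi: cmath.exp(1j * xi * xi), -xi0, xi0, n)
    # I_x * I_y = (2z/k) * j**2, hence C * (1/z) * I_x * I_y = j**2 / (i * pi)
    return j * j / (1j * math.pi)


if __name__ == "__main__":
    for xi0 in (5.0, 10.0, 30.0):
        print(xi0, plane_wave_ratio(xi0))
```

The printed ratio oscillates around 1 with an amplitude that decreases roughly as $1/\xi_0$, in line with the statement that only the central area (77) of the front contributes substantially.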


Let me pause to emphasize how nontrivial this result is. It would be a natural corollary of Eq. 
(25) (and the linear superposition principle) if all points of the orifice were filled with point scatterers 
that re-emit all the incident waves into spherical waves. However, as it follows from the above proof, 
the Huygens principle is also valid if there is nothing in the orifice but the free space! 


This is why a proof of the principle,27 based on Green's theorem (2.207), is important. Let us apply this theorem to function $f = f_\omega$, where $f_\omega$ is the complex amplitude of a scalar component of one of the wave's fields, which satisfies the Helmholtz equation (7.192),

$$(\nabla^2 + k^2)f_\omega(\mathbf{r}) = 0, \tag{8.79}$$

and function $g = g_\omega$, which is the time Fourier image of the corresponding Green's function. It may be defined, as usual, as the solution to the same equation with an added delta-functional right-hand part with an arbitrary coefficient, for example,

$$(\nabla^2 + k^2)g_\omega(\mathbf{r}, \mathbf{r}') = -4\pi\delta(\mathbf{r} - \mathbf{r}'). \tag{8.80}$$

25 See, e.g., MA Eq. (6.10).

26 This result is very natural, because $\exp\{ikR\}$ oscillates fast with the change of $\mathbf{r}'$, so that the contributions from various front points are averaged out. Indeed, the only reason why the central part of plane $[x', y']$ gives a nonvanishing contribution (76) to $f_\omega(z)$ is that the phase exponent stops oscillating at $(x'^2 + y'^2)$ below $\sim z/k$ - see Eq. (73).

27 This proof was given in 1882 by G. Kirchhoff.


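With Eqs. (79) and (80) used to express the Laplace operators of functions $f_\omega$ and $g_\omega$, Eq. (2.207) becomes

$$\int_V\left\{f_\omega\left[-k^2g_\omega(\mathbf{r}, \mathbf{r}') - 4\pi\delta(\mathbf{r} - \mathbf{r}')\right] - g_\omega(\mathbf{r}, \mathbf{r}')\left[-k^2f_\omega\right]\right\}d^3r = \oint_S\left[f_\omega\frac{\partial g_\omega(\mathbf{r}, \mathbf{r}')}{\partial n} - g_\omega(\mathbf{r}, \mathbf{r}')\frac{\partial f_\omega}{\partial n}\right]d^2r, \tag{8.81}$$

where $\mathbf{n}$ is the outward normal to the surface $S$ limiting volume $V$. Two terms in the left-hand side of this relation cancel, so that after swapping $\mathbf{r}$ and $\mathbf{r}'$ we get

$$-4\pi f_\omega(\mathbf{r}) = \oint_S\left[f_\omega(\mathbf{r}')\frac{\partial g_\omega(\mathbf{r}', \mathbf{r})}{\partial n'} - g_\omega(\mathbf{r}', \mathbf{r})\frac{\partial f_\omega(\mathbf{r}')}{\partial n'}\right]d^2r'. \tag{8.82}$$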


This relation is only correct if the selected volume $V$ includes point $\mathbf{r}$ (otherwise we would not get its left-hand part from the integration of the delta-function), but does not include the genuine source of the wave (otherwise Eq. (79) would have a nonvanishing right-hand part). Let $\mathbf{r}$ be the field observation point, and $V$ all the source-free half-space (for example, the half-space right of the screen in Fig. 9), so that $S$ is the surface of the screen, including the orifice. Then the right-hand part of Eq. (82) describes the field at the observation point $\mathbf{r}$ induced by the wave passing through the orifice points $\mathbf{r}'$. Since no waves are emitted by the opaque parts of the screen, we can limit the integration to the orifice area.28 Assuming also that the opaque parts of the screen do not re-emit waves “radiated” by the orifice, we can take the solution of Eq. (80) to be the retarded potential for the free space:29

$$g_\omega(\mathbf{r}, \mathbf{r}') = \frac{e^{ikR}}{R}. \tag{8.83}$$


Plugging this expression into Eq. (82), we get

$$-4\pi f_\omega(\mathbf{r}) = \int_{\text{orifice}}\left[f_\omega(\mathbf{r}')\frac{\partial}{\partial n'}\left(\frac{e^{ikR}}{R}\right) - \left(\frac{e^{ikR}}{R}\right)\frac{\partial f_\omega(\mathbf{r}')}{\partial n'}\right]d^2r'. \tag{8.84}$$

This is the so-called Kirchhoff (or “Fresnel-Kirchhoff”) integral.30 Now, let us make two additional approximations. The first of them stems from Eq. (70): at $ka \gg 1$, the wave's spatial dependence in the orifice area may be presented as

$$f_\omega(\mathbf{r}') = (\text{a slow function of } \mathbf{r}') \times \exp\{i\mathbf{k}_0\cdot\mathbf{r}'\}, \tag{8.85}$$

where “slow” means a function that changes on the scale of $a$ rather than $\lambda$. If, also, $kR \gg 1$, then the differentiation in Eq. (84) may be, in both instances, limited to the rapidly changing exponents, giving

$$-4\pi f_\omega(\mathbf{r}) = -i\int_{\text{orifice}}(\mathbf{k} + \mathbf{k}_0)\cdot\mathbf{n}'\,\frac{e^{ikR}}{R}\,f_\omega(\mathbf{r}')\,d^2r'. \tag{8.86}$$

Second, if all observation angles are small, we can take $\mathbf{k}\cdot\mathbf{n}' \approx \mathbf{k}_0\cdot\mathbf{n}' \approx -k$. With that, Eq. (86) is reduced to Eq. (78) expressing the Huygens principle.

28 Actually, this is a somewhat nontrivial point of the proof. Indeed, it may be shown that the solution of Eq. (79) identically equals zero if $f(\mathbf{r}')$ and $\partial f(\mathbf{r}')/\partial n'$ vanish together at any part of the boundary. As a result, building the solution with the account of exact boundary conditions (which is the task of the vector theory of diffraction) is possible but cumbersome. Here we base our solution on physical intuition.

29 It follows, e.g., from Eq. (16) with a monochromatic source $q(t) = q_\omega\exp\{-i\omega t\}$, at the value $q_\omega = 4\pi\varepsilon$ that fits the right-hand part of Eq. (80).

30 With the integration extended over all boundaries of volume $V$, this would be an exact result.

It is clear that the principle immediately gives a very simple description of the interference of waves passing through two small holes in the screen. Indeed, if the hole size is negligible in comparison with the distance $a$ between them (though still much larger than the wavelength!), Eq. (78) yields

$$f_\omega(\mathbf{r}) = c_1e^{ikR_1} + c_2e^{ikR_2}, \quad\text{with } c_{1,2} \equiv \frac{kf_{1,2}A_{1,2}}{2\pi iR_{1,2}}, \tag{8.87}$$

where $R_{1,2}$ are the distances between the holes and the observation point, and $A_{1,2}$ are the hole areas. For the interference wave intensity, Eq. (87) yields

$$S \propto f_\omega f_\omega^* = |c_1|^2 + |c_2|^2 + 2|c_1||c_2|\cos[k(R_1 - R_2) + \varphi], \quad \varphi \equiv \arg c_1 - \arg c_2. \tag{8.88}$$


The first two terms in this result clearly represent the intensities of the partial waves passed through each hole, while the last one is the result of their interference. The interference pattern's contrast ratio

$$\frac{S_{\max}}{S_{\min}} = \left(\frac{|c_1| + |c_2|}{|c_1| - |c_2|}\right)^2 \tag{8.89}$$


is largest (infinite) when both waves have equal amplitudes. 
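The two-wave intensity and its contrast ratio are easy to tabulate. The following sketch merely restates the two-hole intensity formula and Eq. (89) in code; the function names and the sample amplitudes are hypothetical, and the overall proportionality factor in the intensity is set to 1.

```python
import math


def two_wave_intensity(c1, c2, phase):
    """|c1|^2 + |c2|^2 + 2|c1||c2| cos(phase), with phase = k(R1 - R2) + phi."""
    return abs(c1) ** 2 + abs(c2) ** 2 + 2 * abs(c1) * abs(c2) * math.cos(phase)


def contrast_ratio(c1, c2):
    """S_max / S_min = ((|c1| + |c2|) / (|c1| - |c2|))^2; infinite for equal amplitudes."""
    s_max = (abs(c1) + abs(c2)) ** 2
    s_min = (abs(c1) - abs(c2)) ** 2
    return math.inf if s_min == 0 else s_max / s_min


if __name__ == "__main__":
    print(contrast_ratio(1.0, 0.5))   # unequal amplitudes: finite contrast
    print(contrast_ratio(1.0, 1.0))   # equal amplitudes: infinite contrast
```

For amplitudes 1 and 0.5, the extreme intensities are 2.25 and 0.25, so the contrast ratio is 9; it grows without limit as the amplitudes become equal.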


The analysis of the interference pattern is simple if the line connecting the holes is perpendicular to wave vector $\mathbf{k} \approx \mathbf{k}_0$ - see Fig. 6a. Selecting the coordinate axes as shown in that figure, and using for distances $R_{1,2}$ the same expansion as in Eq. (73), for the interference term in Eq. (88) we get

$$\cos[k(R_1 - R_2) + \varphi] \approx \cos\left(\frac{kxa}{z} + \varphi\right). \tag{8.90}$$

This means that the intensity does not depend on $y$, i.e. the interference pattern in the plane of constant $z$ presents straight, parallel strips, perpendicular to vector $\mathbf{a}$, with the period given by Eq. (60), i.e. by the Bragg law.31 Note that this (somewhat counter-intuitive) result is strictly valid only at $(x^2 + y^2) \ll z^2$; it is straightforward to use the next term in the Taylor expansion (73) to show that farther from the interference pattern's center the strips start to diverge.
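To get a feeling for the numbers, the small-angle result above may be turned into a two-line estimate: the argument $kxa/z$ of the cosine changes by $2\pi$ when $x$ changes by $\lambda z/a$, and an additional phase shift $\varphi$ displaces the pattern by $-(z/ka)\varphi$. The sketch below uses hypothetical sample values (green light, a 0.1-mm hole spacing, and a 1-m distance to the observation plane); the function names are mine.

```python
import math


def fringe_period(wavelength, a, z):
    """Period of cos(k x a / z + phi) in x: 2 pi z / (k a) = wavelength * z / a."""
    return wavelength * z / a


def pattern_shift(phi, wavelength, a, z):
    """Displacement of the fringes due to a phase shift phi: -(z / k a) * phi."""
    k = 2 * math.pi / wavelength
    return -(z / (k * a)) * phi


if __name__ == "__main__":
    wavelength, a, z = 500e-9, 1e-4, 1.0   # assumed values, in meters
    print(fringe_period(wavelength, a, z))              # about 5 mm
    print(pattern_shift(math.pi, wavelength, a, z))     # half a period, i.e. about -2.5 mm
```

A phase shift of $2\pi$ displaces the pattern by exactly one period, so it cannot be distinguished from no shift at all.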


31 The phase shift $\varphi$ vanishes at the normal incidence of a plane wave on the holes. Note, however, that the spatial shift of the interference pattern following from Eq. (90), $\Delta x = -(z/ka)\varphi$, is extremely convenient for the experimental measurement of the phase shift between two waves, especially if it is induced by some factor (such as insertion of a transparent object into one of the interferometer's arms, etc.) that may be turned on/off at will.

8.6. Diffraction on a slit

Now let us use the Huygens principle to analyze a more complex problem: a plane wave's diffraction on a long straight slit of constant width $a$ (Fig. 11).


Fig. 8.11. Diffraction on a slit.


According to Eq. (70), in order to use the Huygens principle for the problem analysis we need to have $\lambda \ll a \ll z$. Moreover, the simple formulation (78) of the principle is only valid for small observation angles, $|x| \ll z$. Note, however, that the relation between two large dimensionless numbers, $z/a$ and $a/\lambda$, is so far arbitrary; as we will see in a minute, this relation will determine the type of the observed diffraction pattern.

Let us apply Eq. (78) to our current problem (Fig. 11), for the sake of simplicity assuming the normal wave incidence, and taking $z = 0$ at the screen plane:


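$$f_\omega(x, z) = f_0\,\frac{k}{2\pi i}\int_{-a/2}^{+a/2}dx'\int_{-\infty}^{+\infty}dy'\,\frac{\exp\{ik[(x - x')^2 + y'^2 + z^2]^{1/2}\}}{[(x - x')^2 + y'^2 + z^2]^{1/2}}, \tag{8.91}$$

where $f_0 \equiv f_\omega(x', 0) = \text{const}$ is the incident wave's amplitude. This is the same integral as in Eq. (72), except for the finite limits for $x'$, and may be simplified similarly, using the small-angle condition $(x - x')^2 + y'^2 \ll z^2$:

$$f_\omega(x, z) \approx f_0\,\frac{k}{2\pi i}\,\frac{e^{ikz}}{z}\int_{-a/2}^{+a/2}dx'\int_{-\infty}^{+\infty}dy'\,\exp\left\{\frac{ik[(x - x')^2 + y'^2]}{2z}\right\} \equiv f_0\,\frac{k}{2\pi i}\,\frac{e^{ikz}}{z}\,I_xI_y. \tag{8.92}$$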


The integral over $y'$ is the same as in the last section:

$$I_y \equiv \int_{-\infty}^{+\infty}\exp\left\{\frac{iky'^2}{2z}\right\}dy' = \left(\frac{2\pi iz}{k}\right)^{1/2}, \tag{8.93}$$

but the integral over $x'$ is more complicated, because of its finite limits:

$$I_x \equiv \int_{-a/2}^{+a/2}\exp\left\{\frac{ik(x - x')^2}{2z}\right\}dx'. \tag{8.94}$$

It may be simplified in the following two (opposite) limits.

(i) Fraunhofer diffraction takes place when $z/a \gg a/\lambda$ - the relation which may be rewritten either as $a \ll (z\lambda)^{1/2}$, or as $ka^2 \ll z$. In this limit the ratio $kx'^2/z$ is negligibly small for all values of $x'$ under the integral, and we can approximate it as

$$I_x = \int_{-a/2}^{+a/2}\exp\left\{\frac{ik(x^2 - 2xx' + x'^2)}{2z}\right\}dx' \approx \int_{-a/2}^{+a/2}\exp\left\{\frac{ik(x^2 - 2xx')}{2z}\right\}dx' \equiv \exp\left\{\frac{ikx^2}{2z}\right\}\int_{-a/2}^{+a/2}\exp\left\{-\frac{ikxx'}{z}\right\}dx' = \frac{2z}{kx}\exp\left\{\frac{ikx^2}{2z}\right\}\sin\frac{kxa}{2z}, \tag{8.95}$$

so that Eq. (92) yields

$$f_\omega(x, z) \approx f_0\,\frac{k}{2\pi i}\,\frac{e^{ikz}}{z}\,\frac{2z}{kx}\left(\frac{2\pi iz}{k}\right)^{1/2}\exp\left\{\frac{ikx^2}{2z}\right\}\sin\frac{kxa}{2z}, \tag{8.96}$$


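and hence the relative wave intensity is

$$\frac{S(x, z)}{S_0} = \left|\frac{f_\omega(x, z)}{f_0}\right|^2 = \frac{2z}{\pi kx^2}\sin^2\frac{kxa}{2z} \equiv \frac{1}{2\pi}\,\frac{ka^2}{z}\operatorname{sinc}^2\frac{ka\alpha}{2}, \tag{8.97}$$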


where $S_0$ is the (average) intensity of the incident wave, and $\alpha \equiv x/z \ll 1$ is the scattering angle. Comparing this expression with Eq. (69), we see that this diffraction pattern is exactly the same as that of a similar (uniform, 1D) object in the Born approximation - see the red line in Fig. 8. Note again that the angular width $\delta\alpha$ of the Fraunhofer pattern is of the order of $1/ka$, so that its linear width is $\delta x = z\delta\alpha \sim z/ka \sim z\lambda/a$.32 Hence the condition of the Fraunhofer approximation's validity may be also presented as $a \ll \delta x$.
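The step from the exact integral (94) to its Fraunhofer form (95) may be checked numerically. The sketch below integrates Eq. (94) directly with the Simpson rule and compares the result with the closed form $(2z/kx)\exp\{ikx^2/2z\}\sin(kxa/2z)$. All names and sample values are assumptions of mine: $\lambda = 1$, $a = 5$, and $z = 2000$ (in arbitrary units), so that $ka^2/z \approx 0.08 \ll 1$.

```python
import cmath
import math


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals; works for complex values."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def slit_integral_exact(x, z, k, a):
    """Direct numerical evaluation of I_x from Eq. (94)."""
    return simpson(lambda xp: cmath.exp(1j * k * (x - xp) ** 2 / (2 * z)), -a / 2, a / 2)


def slit_integral_fraunhofer(x, z, k, a):
    """Closed-form Fraunhofer approximation of I_x, Eq. (95)."""
    arg = k * x * a / (2 * z)
    envelope = a if abs(arg) < 1e-12 else (2 * z / (k * x)) * math.sin(arg)
    return cmath.exp(1j * k * x * x / (2 * z)) * envelope


if __name__ == "__main__":
    k, a, z = 2 * math.pi, 5.0, 2000.0      # assumed values, arbitrary units
    for x in (0.0, 100.0, 250.0, 400.0, 600.0):
        exact = slit_integral_exact(x, z, k, a)
        approx = slit_integral_fraunhofer(x, z, k, a)
        print(x, abs(exact), abs(approx))
```

With these values, the first zero of the pattern is at $x = \lambda z/a = 400$, and the two columns of magnitudes should differ by much less than the peak value $a$. Repeating the run with a smaller $z$ (so that $ka^2/z$ is no longer small) shows how the approximation degrades.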


(ii) Fresnel diffraction. In the opposite limit of a relatively wide slit, with $a \gg \delta x = z\delta\alpha \sim z/ka \sim z\lambda/a$, i.e. $ka^2 \gg z$, the diffraction patterns at the two slit edges are well separated. Hence, near each edge (for example, near $x' = -a/2$) we may simplify Eq. (94) as

$$I_x(x) \approx \int_{-a/2}^{+\infty}\exp\left\{\frac{ik(x - x')^2}{2z}\right\}dx' \equiv \left(\frac{2z}{k}\right)^{1/2}\int_{-\infty}^{(k/2z)^{1/2}(x + a/2)}\exp\{i\zeta^2\}\,d\zeta, \tag{8.98}$$

and express it via the special functions called the Fresnel integrals:33

$$C(\xi) \equiv \left(\frac{2}{\pi}\right)^{1/2}\int_0^\xi\cos(\zeta^2)\,d\zeta, \qquad S(\xi) \equiv \left(\frac{2}{\pi}\right)^{1/2}\int_0^\xi\sin(\zeta^2)\,d\zeta, \tag{8.99}$$

whose plots are shown in Fig. 12. As was mentioned above, at large values of their argument ($\xi$), both functions tend to 1/2.

Plugging this expression into Eqs. (92) and (98), for the diffracted wave intensity, in the Fresnel limit (i.e. at $|x + a/2| \ll a$), we get

$$\frac{S(x, z)}{S_0} = \frac{1}{2}\left\{\left[C\left(\left(\frac{k}{2z}\right)^{1/2}\left(x + \frac{a}{2}\right)\right) + \frac{1}{2}\right]^2 + \left[S\left(\left(\frac{k}{2z}\right)^{1/2}\left(x + \frac{a}{2}\right)\right) + \frac{1}{2}\right]^2\right\}. \tag{8.100}$$

32 Note also that since in this limit $ka^2 \ll z$, Eq. (97) shows that even the maximum value $S(0, z)$ of the diffracted wave intensity is much less than the intensity $S_0$ of the incident wave. This is natural, because the incident power $S_0a$ per unit length of the slit is now distributed over a much larger width $\delta x \gg a$, so that $S(0, z) \sim S_0(a/\delta x) \ll S_0$.

33 Slightly different definitions of these functions, mostly affecting constant factors, may also be met in the literature.


A plot of this function (Fig. 13) shows that the diffraction pattern is very peculiar: while in the “shade” region $x < -a/2$ the wave intensity fades monotonically, the transition to the “light” region within the gap ($x > -a/2$) is accompanied by intensity oscillations, just as at the Fraunhofer diffraction - cf. Fig. 8.
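The edge pattern described above can be reproduced with a few lines of code. The sketch below assumes the Fresnel integrals normalized so that both tend to 1/2 at large arguments (as stated in the text), i.e. $C(\xi) = (2/\pi)^{1/2}\int_0^\xi\cos(\zeta^2)d\zeta$ and similarly for $S(\xi)$, and evaluates the relative intensity as $\frac{1}{2}\{[C + 1/2]^2 + [S + 1/2]^2\}$ as a function of the dimensionless distance $u = (k/2z)^{1/2}(x + a/2)$ from the edge. The function names and the Simpson-rule integrator are illustrative choices, not the author's.

```python
import math


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule with n (even) sub-intervals; hi < lo is allowed."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def fresnel_c(xi):
    """C(xi) = sqrt(2/pi) * integral of cos(t^2) from 0 to xi (assumed normalization)."""
    return math.sqrt(2 / math.pi) * simpson(lambda t: math.cos(t * t), 0.0, xi)


def fresnel_s(xi):
    """S(xi) = sqrt(2/pi) * integral of sin(t^2) from 0 to xi (assumed normalization)."""
    return math.sqrt(2 / math.pi) * simpson(lambda t: math.sin(t * t), 0.0, xi)


def edge_intensity(u):
    """Relative intensity S/S0 near a slit edge; u > 0 is the light side, u < 0 the shade."""
    return 0.5 * ((fresnel_c(u) + 0.5) ** 2 + (fresnel_s(u) + 0.5) ** 2)


if __name__ == "__main__":
    for u in (-3.0, -1.0, 0.0, 1.5, 3.0, 6.0):
        print(f"u = {u:+.1f}: S/S0 = {edge_intensity(u):.4f}")
```

Exactly at the geometric edge of the shade ($u = 0$) the intensity equals one quarter of the incident one, on the light side it overshoots the incident intensity before settling to it in an oscillatory way, and on the shade side it decays monotonically.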



Fig. 8.13. Fresnel 
diffraction pattern. 


This behavior, which is described by the following asymptotes,

$$\frac{S}{S_0} \to \begin{cases} 1 + \dfrac{1}{\sqrt{\pi}}\,\dfrac{\sin(\xi^2 - \pi/4)}{\xi}, & \text{for } \xi \equiv \left(\dfrac{k}{2z}\right)^{1/2}\left(x + \dfrac{a}{2}\right) \to +\infty, \\[2ex] \dfrac{1}{4\pi\xi^2}, & \text{for } \xi \to -\infty, \end{cases} \tag{8.101}$$

is essentially an artifact of observing just the wave intensity (i.e. its real amplitude) rather than its phase as well. Indeed, as may be seen even more clearly from the parametric presentation of the Fresnel integrals (Fig. 14), these functions oscillate similarly at large positive and negative values of their argument. Physically, this means that the wave diffraction by the slit edge leads to similar oscillations of its phase at $x < -a/2$ and $x > -a/2$; however, in the latter region (i.e. inside the slit) the diffracted wave overlaps the incident wave passing through the slit directly, and their interference reveals the phase oscillations, making them visible in the measured intensity as well.





Fig. 8.14. Parametric representation of the 
Fresnel integrals. This pattern is called 
either the Euler spiral or Cornu spiral. 


Note that according to Eq. (100), the linear scale of the Fresnel diffraction pattern is $(2z/k)^{1/2}$, i.e. complies with estimate (77). If the slit is gradually narrowed, so that width $a$ becomes comparable to that scale,34 the Fresnel interference patterns from both edges start to “collide” (interfere). The resulting wave, fully described by Eq. (94), is just a sum of two contributions of the type (98) from both edges of the slit. The resulting interference pattern is somewhat complicated, and only at $a \ll \delta x$ is it reduced to the simple Fraunhofer pattern (97). Of course, this crossover from the Fresnel to Fraunhofer diffraction may be also observed, at fixed wavelength $\lambda$ and slit width $a$, by increasing $z$, i.e. by measuring the diffraction pattern farther and farther from the slit.

Note that the Fraunhofer limit is always valid if the diffraction is measured as a function of the diffraction angle $\alpha$ alone, i.e. effectively at infinity, $z \to \infty$. This may be done, for example, by collecting the diffracted wave with a “positive” (converging) lens, and observing the diffraction pattern in its focal plane.


8.7. Geometrical optics placeholder 

Behind all these details, I would not like the reader to miss the main feature of diffraction, which has an overwhelming practical significance. Namely, besides narrow diffraction “cones” (actually, parabolic-shaped regions) with lateral scale $\Delta x \sim (\lambda z)^{1/2}$, the wave far behind a slit of width $a \gg \lambda$ repeats the field just behind the slit, i.e. reproduces the unperturbed incident wave inside the slit, and has negligible intensity in the shade regions outside it. An evident generalization of this fact is that when a plane wave (in particular an electromagnetic wave) passes any opaque object of large size $a \gg \lambda$, it propagates around it, by distances $z$ up to $\sim a^2/\lambda$, along straight lines, with virtually negligible diffraction effects. This fact gives the strict foundation for the very notion of the wave ray (or beam), as the line perpendicular to the local front of a quasi-plane wave. In a uniform medium such a ray is a straight line, but it changes direction in accordance with the Snell law at the interface of two media with different wave speeds $v$, i.e. different values of the refraction index. The notion of rays enables the whole field of geometric optics, devoted mostly to ray tracing in various (sometimes very complex) systems.

34 Note that this condition may be also rewritten as $a \sim \delta x$, i.e. $z/a \sim a/\lambda$.

This is why, at this point, an E&M course that followed the scientific logic more faithfully than this one would give an extended discussion of geometric and quasi-geometric optics, including (as a minimum35) such vital topics as

- the so-called lensmaker's equation expressing the focal length $f$ of a lens via the curvature radii of its spherical surfaces and the refraction index of the lens material,

- the thin lens formula relating the image distance from the lens to $f$ and the source distance,

- the concepts of basic optical instruments such as telescopes and microscopes,

- the concepts of the spherical, angular, and chromatic aberrations (image distortions), and

- wave effects in optical instruments, including the so-called Abbe limit36 on the focal spot size.37

However, since I have made a (possibly, wrong) decision to follow the common tradition in selecting the main topics for this course, I do not have time left for such a discussion. Still, I am placing this “placeholder” pseudo-section to relay my conviction that any educated physicist has to know the basics of geometric optics. If the reader has not had exposure to this subject during his or her undergraduate studies, I highly recommend at least browsing one of the available textbooks.38


35 Admittedly, even this list leaves aside several spectacular effects due to crystal anisotropy, including such a beauty as conical refraction in biaxial crystals - see, e.g., Chapter 15 of the classical textbook by M. Born and E. Wolf, cited in the end of Sec. 7.1.

36 Reportedly, due to not only E. Abbe (1873), but also to H. von Helmholtz (1874).

37 In contrast to other topics of this list, whose study may be based on the ray approach, i.e. on purely geometric optics, the description of these effects requires at least an approximate account of wave properties of light. Such an account may be based either on the Huygens principle or on the so-called paraxial equation

$$\frac{\partial a}{\partial z} = \frac{i}{2k}\nabla^2_{x,y}\,a,$$

for the complex amplitude $a(\mathbf{r})$ of the field represented in the form $f(\mathbf{r}) = a(\mathbf{r})e^{ikz}$. The paraxial approximation follows from the Helmholtz equation (7.192) in essentially the same limit ($|\nabla a| \ll k|a|$; $|x|, |y| \ll z$) as Eq. (78).

38 My top recommendation for that purpose would be Chapters 3-6 and Sec. 8.6 in Born and Wolf. A simpler alternative is Chapter 10 in G. R. Fowles, Introduction to Modern Optics, 2nd ed., Dover, 1989.

8.8. Fraunhofer diffraction from more complex scatterers

So far, our discussion of diffraction has been limited to a very simple geometry - a single slit in an otherwise opaque screen (Fig. 11). However, in the most important Fraunhofer limit, $z \gg ka^2$, it is easy to get a very simple expression for the plane wave diffraction/interference by a plane orifice (with linear size $\sim a$) of an arbitrary shape. Indeed, the evident 2D generalization of approximation (93)-(94) is

$$I_xI_y = \int_{\text{orifice}}\exp\left\{\frac{ik[(x - x')^2 + (y - y')^2]}{2z}\right\}dx'dy' \approx \exp\left\{\frac{ik(x^2 + y^2)}{2z}\right\}\int_{\text{orifice}}\exp\left\{-i\frac{kxx'}{z} - i\frac{kyy'}{z}\right\}dx'dy', \tag{8.102}$$

so that besides the inconsequential total phase factor, Eq. (92) is reduced to

$$f_\omega(\boldsymbol\rho) \propto f_0\int_{\text{orifice}}\exp\{-i\boldsymbol\kappa\cdot\boldsymbol\rho'\}\,d^2\rho' \equiv f_0\int_{\text{screen}}T(\boldsymbol\rho')\exp\{-i\boldsymbol\kappa\cdot\boldsymbol\rho'\}\,d^2\rho', \tag{8.103}$$


where the 2D vector $\boldsymbol\kappa$ (not to be confused with wave vector $\mathbf{k}$ that is virtually perpendicular to $\boldsymbol\kappa$!) is defined as

$$\boldsymbol\kappa \equiv k\frac{\boldsymbol\rho}{z} \approx \mathbf{q} \equiv \mathbf{k} - \mathbf{k}_0, \tag{8.104}$$

$\boldsymbol\rho = \{x, y\}$ and $\boldsymbol\rho' = \{x', y'\}$ are 2D radius-vectors in, respectively, the observation and screen planes (both nearly normal to vectors $\mathbf{k}$ and $\mathbf{k}_0$), function $T(\boldsymbol\rho')$ describes the screen's transparency at point $\boldsymbol\rho'$, and the last integral in Eq. (103) is over the whole screen plane $z' = 0$. (Though the strict equivalence of the two forms of Eq. (103) is only valid if $T(\boldsymbol\rho')$ equals either 1 or 0, its last form may be readily obtained from Eq. (78) with $f_\omega(\mathbf{r}') = T(\boldsymbol\rho')f_0$ for any transparency profile, provided that $T(\boldsymbol\rho')$ is an arbitrary function that changes only at distances much larger than $\lambda = 2\pi/k$.)

From the mathematical point of view, the last form of Eq. (103) is the 2D spatial Fourier transform of function $T(\boldsymbol\rho')$, with the reciprocal variable $\boldsymbol\kappa$ revealed by the observation point's position: $\boldsymbol\rho = (z/k)\boldsymbol\kappa = (z\lambda/2\pi)\boldsymbol\kappa$. This interpretation is useful because of the experience we all have with the Fourier transform, mostly in the context of its time/frequency applications. For example, if the orifice is a single small hole, $T(\boldsymbol\rho')$ may be approximated by a delta-function, so that Eq. (103) yields $f(\boldsymbol\rho) \approx \text{const}$. This corresponds (at least for the small diffraction angles $\alpha \equiv \rho/z$, for which the Huygens approximation is valid) to a spherical wave spreading from the point-like orifice. Next, for two small holes, Eq. (103) immediately gives the Young interference pattern (90). Let me now use Eq. (103) to analyze the simplest (and most important) 1D transparency profiles, leaving 2D cases for the reader's exercise.

(i) A single slit of width $a$ (Fig. 11) may be described by transparency

$$T(\boldsymbol\rho') = \begin{cases}1, & \text{for } |x'| < a/2, \\ 0, & \text{otherwise.}\end{cases} \tag{8.105}$$

Its substitution into Eq. (103) yields

$$f_\omega(\boldsymbol\rho) \propto f_0\int_{-a/2}^{+a/2}\exp\{-i\kappa_xx'\}\,dx' \propto \operatorname{sinc}\frac{\kappa_xa}{2} = \operatorname{sinc}\frac{kxa}{2z}, \tag{8.106}$$

naturally returning us to Eqs. (64) and (97), and hence to the red lines in Fig. 8 for the wave intensity. (Please note again that Eq. (103) describes only the Fraunhofer, but not the Fresnel diffraction!)

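(ii) Two similar, narrow, parallel slits with a much larger distance $a$ between them may be described by taking

$$T(\boldsymbol\rho') \propto \delta(x' - a/2) + \delta(x' + a/2), \tag{8.107}$$

so that Eq. (103) yields the generic interference pattern,

$$f_\omega(\boldsymbol\rho) \propto \exp\left\{-\frac{i\kappa_xa}{2}\right\} + \exp\left\{\frac{i\kappa_xa}{2}\right\} \propto \cos\frac{\kappa_xa}{2} = \cos\frac{kxa}{2z}, \tag{8.108}$$

whose intensity is shown with the blue line in Fig. 8.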


(iii) In a more realistic Young-type two-slit experiment, each slit has a width (say, $w$) that is much larger than the light wavelength $\lambda$, but still much smaller than the slit spacing $a$. This situation may be described by the following transparency function:

$$T(\boldsymbol{\rho}') = \sum_{\pm} \begin{cases} 1, & \text{for } |x' \pm a/2| < w/2, \\ 0, & \text{otherwise}, \end{cases} \tag{8.109}$$

for which Eq. (103) yields a natural combination of results (106) (with $a$ replaced with $w$) and (108):

$$f(\boldsymbol{\rho}) \propto \mathrm{sinc}\left(\frac{kxw}{2z}\right)\cos\left(\frac{kxa}{2z}\right). \tag{8.110}$$

This is the usual interference pattern modulated by a Fraunhofer-diffraction envelope (shown with the dashed blue line in Fig. 15). Since the function $\mathrm{sinc}^2\xi$ decreases very fast beyond its first zeros at $\xi = \pm\pi$, the practical number of observable interference fringes is close to $2a/w$.
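The estimate of about $2a/w$ observable fringes can be checked numerically. The following Python sketch is an illustration only, not part of the original notes: it evaluates the intensity following from Eq. (110) on a grid of the dimensionless variable $u = kxw/2z$ and counts the intensity maxima inside the central lobe of the envelope, $|u| < \pi$. The function names and the grid size are arbitrary choices made for this example.

```python
import math


def sinc(xi):
    """sinc(xi) = sin(xi)/xi, with the xi -> 0 limit set to 1."""
    return 1.0 if xi == 0.0 else math.sin(xi) / xi


def intensity(u, a_over_w):
    """Normalized intensity |f|^2 following Eq. (110).

    u = k*x*w/(2*z) is the argument of the sinc envelope, so that the
    interference factor cos(k*x*a/(2*z)) becomes cos(a_over_w * u).
    """
    return (sinc(u) * math.cos(a_over_w * u)) ** 2


def count_fringes_in_central_lobe(a_over_w, samples=20001):
    """Count local intensity maxima inside the central lobe |u| < pi."""
    us = [-math.pi + 2.0 * math.pi * i / (samples - 1) for i in range(samples)]
    vals = [intensity(u, a_over_w) for u in us]
    return sum(
        1
        for i in range(1, samples - 1)
        if vals[i - 1] < vals[i] >= vals[i + 1]
    )


if __name__ == "__main__":
    for ratio in (5, 10, 20):
        # Compare the counted number of fringes with the estimate 2a/w.
        print(ratio, count_fringes_in_central_lobe(ratio), 2 * ratio)
```

The count includes the two weak outermost maxima next to the envelope zeros, so it may differ from $2a/w$ by one.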



Fig. 8.15. Young's double-slit interference pattern for a finite slit width (horizontal axis: $x/(2z/kw)$).


(iv) A structure very useful for experimental and engineering practice is a set of many parallel slits, called the diffraction grating.³⁹ Indeed, if the slit width is much less than the grating period $d$, then the transparency function may be approximated as

$$T(\boldsymbol{\rho}') \propto \sum_{n=-\infty}^{+\infty} \delta(x' - nd), \tag{8.111}$$

and Eq. (103) yields

$$f(\boldsymbol{\rho}) \propto \sum_{n=-\infty}^{+\infty} \exp\{-in\kappa_x d\} = \sum_{n=-\infty}^{+\infty} \exp\left\{-i\frac{nkxd}{z}\right\}. \tag{8.112}$$

This sum vanishes for all values of $\kappa_x d$ that are not multiples of $2\pi$, so that the result describes sharp intensity peaks at diffraction angles

$$\alpha_m \equiv \left(\frac{x}{z}\right)_m = \left(\frac{\kappa_x}{k}\right)_m = \frac{2\pi m}{kd} = m\frac{\lambda}{d}. \tag{8.113}$$

39 The rudimentary diffraction grating effect, produced by parallel fibers of bird feathers, was discovered as early as 1673 by J. Gregory, who also invented the reflecting ("Gregorian") telescope.


Taking into account that this result is only valid for small angles $|\alpha_m| \ll 1$, it may be interpreted exactly as Eq. (59) - see Fig. 6a. However, in contrast with the interference (108) from two slits, the destructive interference from many slits kills the net wave as soon as the angle is even slightly different from any Bragg angle (60). This is very convenient for spectroscopic purposes, because the diffraction lines produced by multi-frequency waves do not overlap even if the frequencies of their adjacent components are very close.


Two features of practical diffraction gratings make their properties different from this simple picture. First, the finite number $N$ of slits, which may be described by limiting the sum in Eq. (111) to the interval $n = [-N/2, +N/2]$, results in a finite spread, $\delta\alpha/\alpha \sim 1/N$, of each diffraction peak, and hence in a reduction of the grating's spectral resolution. (Unintentional variations of the inter-slit distance $d$ have a similar effect, so that before the advent of high-resolution photolithography, special high-precision mechanical tools were used for grating fabrication.)
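The effect of the finite number of slits is easy to see numerically. The sketch below is only an illustration under simple assumptions: it sums the phase factors of Eq. (112) over $N$ slits (indexed from 0 to $N-1$ rather than symmetrically, which does not affect the intensity) and shows that each principal maximum has height $N^2$ and that the first zero is displaced from it by $2\pi/N$ on the scale of $\kappa_x d$. The function names are hypothetical.

```python
import cmath
import math


def grating_amplitude(kappa_d, n_slits):
    """Finite-N version of the sum in Eq. (112); kappa_d = kappa_x * d."""
    return sum(cmath.exp(-1j * n * kappa_d) for n in range(n_slits))


def grating_intensity(kappa_d, n_slits):
    """Intensity (up to a constant factor) of the diffracted wave."""
    return abs(grating_amplitude(kappa_d, n_slits)) ** 2


def first_zero_offset(n_slits):
    """Distance (in kappa_x * d) from a principal maximum to the nearest zero."""
    return 2.0 * math.pi / n_slits


if __name__ == "__main__":
    n_slits = 50
    peak = 2.0 * math.pi  # the m = 1 principal maximum
    for fraction in (0.0, 0.25, 0.5, 1.0):
        shift = fraction * first_zero_offset(n_slits)
        print(fraction, grating_intensity(peak + shift, n_slits))
```

Since the peak positions scale as $2\pi m$ while their width scales as $2\pi/N$, the relative spread of the first-order peak is of the order of $1/N$, in agreement with the estimate in the text.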

Second, the finite slit width $w$ leads to the modulation of the diffraction peak pattern by a $\mathrm{sinc}(kw\alpha/2)$ envelope, similar to the pattern shown in Fig. 15. Actually, for spectroscopic purposes such modulation is a plus, because only one diffraction peak (say, with $m = \pm 1$) is practically used, and if the frequency spectrum of the analyzed wave is very broad (covers more than one octave), the higher peaks produce an undesirable hindrance. For this reason, $w$ is frequently selected to be exactly equal to $d/2$, thus suppressing every other diffraction maximum. Moreover, sometimes semi-transparent films are used to make the transparency function $T(\boldsymbol{\rho}')$ continuous and close to the sinusoidal one:

$$T(\boldsymbol{\rho}') = T_0 + T_1\cos\frac{2\pi x'}{d} = T_0 + \frac{T_1}{2}\left(\exp\left\{i\frac{2\pi x'}{d}\right\} + \exp\left\{-i\frac{2\pi x'}{d}\right\}\right). \tag{8.114}$$

Plugging the last expression into Eq. (103) and integrating, we see that the output wave consists of just 3 components: the direct-passing wave (proportional to $T_0$) and two diffracted waves (proportional to $T_1$) propagating in the directions of the two lowest Bragg angles, $\alpha_{\pm 1} = \pm\lambda/d$.

Relation (103) may also be readily used to obtain one more general (and rather curious) result called the Babinet principle. Consider two experiments with diffraction of similar plane waves on two "complementary" screens that together would cover the whole plane, without a hole or an overlap. (Think, for example, about an opaque disk of radius $R$ and a large opaque screen with a round orifice of the same radius.) Then, according to the Babinet principle, the diffracted wave patterns produced by these two screens in all directions with $\alpha \neq 0$ are identical. The proof of this principle is straightforward: since the transparency functions produced by the screens are complementary in the following sense:

$$T(\boldsymbol{\rho}') \equiv T_1(\boldsymbol{\rho}') + T_2(\boldsymbol{\rho}') = 1, \tag{8.115}$$

and (in the Fraunhofer approximation (103) only!) the diffracted wave is a linear Fourier transform of $T(\boldsymbol{\rho}')$, we get

$$f_1(\boldsymbol{\rho}) + f_2(\boldsymbol{\rho}) = f_0(\boldsymbol{\rho}), \tag{8.116}$$

where $f_0$ is the wave "scattered" by the composite screen with $T_0(\boldsymbol{\rho}') = 1$, i.e. the unperturbed initial wave propagating in the initial direction ($\alpha = 0$). In all other directions, $f_1 = -f_2$, i.e. the diffracted waves


are indeed similar besides the difference in sign - which is equivalent to a phase shift by $\pm\pi$. However, it is important to remember that, the Babinet principle notwithstanding, in real experiments the diffracted waves may interfere with the unperturbed plane wave $f_0(\boldsymbol{\rho})$, leading to different diffraction patterns in cases 1 and 2 - see, e.g., Fig. 13 and its discussion.
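The Babinet principle may be illustrated with a discrete 1D analog of the Fourier integral in Eq. (103). The Python sketch below is only an illustration under stated assumptions: the screen plane is replaced with a periodic grid of sample points, a slit and the complementary opaque strip are described by transparencies adding up to 1, and the Fourier sums are evaluated at the reciprocal-variable values $\kappa = 2\pi m/L$ commensurate with the (hypothetical) grid period $L$. For any $m \neq 0$ the two diffracted amplitudes differ only by sign.

```python
import cmath
import math


def fourier_sum(transparency, kappa, xs):
    """Discrete 1D analog of the last form of Eq. (103)."""
    return sum(t * cmath.exp(-1j * kappa * x) for t, x in zip(transparency, xs))


def babinet_pair(m, half_width=1.0, period=40.0, points=400):
    """Return (f1, f2) for a slit and the complementary strip.

    The amplitudes are evaluated at kappa = 2*pi*m/period; all parameter
    values are arbitrary choices made for this illustration.
    """
    xs = [-period / 2.0 + period * j / points for j in range(points)]
    slit = [1.0 if abs(x) < half_width else 0.0 for x in xs]
    strip = [1.0 - t for t in slit]  # complementary screen, as in Eq. (115)
    kappa = 2.0 * math.pi * m / period
    return fourier_sum(slit, kappa, xs), fourier_sum(strip, kappa, xs)


if __name__ == "__main__":
    for m in range(0, 4):
        f1, f2 = babinet_pair(m)
        # For m = 0 the sum f1 + f2 is the unperturbed wave; otherwise it vanishes.
        print(m, abs(f1 + f2), abs(f1) ** 2, abs(f2) ** 2)
```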


8.9. Magnetic dipole and electric quadrupole radiation 

Throughout this chapter, we have seen how many important results may be obtained from Eq. (26) for the electric dipole radiation by a small-size source (Fig. 1). Only in rare cases when such radiation is absent, for example if the dipole moment $\mathbf{p}$ of the source equals zero (or does not change in time - either at all, or at the frequency of our interest), higher-order effects may be important. I will discuss the main two of them, the quadrupole electric and dipole magnetic radiation - mostly for reference purposes, because we will not have much time to discuss their applications.


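In Sec. 2 above, the electric dipole radiation was calculated by plugging the first, leading term of expansion (19) into the exact formula (17b) for the retarded vector-potential $\mathbf{A}(\mathbf{r}, t)$. Let us make a more exact calculation, by keeping the second term of that expansion as well:

$$\mathbf{A}(\mathbf{r},t) \approx \frac{\mu}{4\pi r}\int \mathbf{j}\left(\mathbf{r}',\, t - \frac{r}{v} + \frac{\mathbf{r}'\cdot\mathbf{n}}{v}\right) d^3r'. \tag{8.117}$$

Since the expansion is only valid if the last term in the second argument is relatively small, in the Taylor expansion of $\mathbf{j}$ with respect to that argument we may keep just the first two leading terms:

$$\mathbf{j}\left(\mathbf{r}',\, t - \frac{r}{v} + \frac{\mathbf{r}'\cdot\mathbf{n}}{v}\right) \approx \mathbf{j}(\mathbf{r}', t') + \frac{\partial\mathbf{j}(\mathbf{r}', t')}{\partial t'}\,\frac{\mathbf{r}'\cdot\mathbf{n}}{v}, \quad \text{with } t' \equiv t - \frac{r}{v}, \tag{8.118}$$

so that Eq. (17b) yields $\mathbf{A} = \mathbf{A}_e + \mathbf{A}'$, where $\mathbf{A}_e$ is the electric dipole contribution as given by Eq. (23), and $\mathbf{A}'$ is the new term of the next order in small parameter $r' \ll r$:

$$\mathbf{A}'(\mathbf{r},t) = \frac{\mu}{4\pi r v}\,\frac{\partial}{\partial t'}\int \mathbf{j}(\mathbf{r}', t')\,(\mathbf{r}'\cdot\mathbf{n})\,d^3r'. \tag{8.119}$$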


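Just as was done in Sec. 2, let us evaluate this term for a system of nonrelativistic particles with electric charges $q_k$ and radius-vectors $\mathbf{r}_k(t)$:

$$\mathbf{A}'(\mathbf{r},t) = \frac{\mu}{4\pi r v}\left[\frac{d}{dt}\sum_k q_k\,\dot{\mathbf{r}}_k\,(\mathbf{r}_k\cdot\mathbf{n})\right]_{t=t'}. \tag{8.120}$$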


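Using the "bac minus cab" identity of the vector algebra again,⁴⁰ the vector operand of Eq. (120) may be rewritten as

$$\dot{\mathbf{r}}_k(\mathbf{r}_k\cdot\mathbf{n}) = \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{r}_k\cdot\mathbf{n}) + \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{n}\cdot\mathbf{r}_k) = \frac{1}{2}(\mathbf{r}_k\times\dot{\mathbf{r}}_k)\times\mathbf{n} + \frac{1}{2}\mathbf{r}_k(\mathbf{n}\cdot\dot{\mathbf{r}}_k) + \frac{1}{2}\dot{\mathbf{r}}_k(\mathbf{n}\cdot\mathbf{r}_k) = \frac{1}{2}(\mathbf{r}_k\times\dot{\mathbf{r}}_k)\times\mathbf{n} + \frac{1}{2}\frac{d}{dt}\left[\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k)\right], \tag{8.121}$$

so that the right-hand part of Eq. (120) may be presented as a sum of two terms, $\mathbf{A}' = \mathbf{A}_m + \mathbf{A}_q$, where

$$\mathbf{A}_m(\mathbf{r},t) = \frac{\mu}{4\pi r v}\,\dot{\mathbf{m}}(t')\times\mathbf{n} = \frac{\mu}{4\pi r v}\,\dot{\mathbf{m}}\left(t - \frac{r}{v}\right)\times\mathbf{n}, \quad \text{with } \mathbf{m}(t) \equiv \frac{1}{2}\sum_k \mathbf{r}_k(t)\times q_k\dot{\mathbf{r}}_k(t), \tag{8.122}$$

$$\mathbf{A}_q(\mathbf{r},t) = \frac{\mu}{8\pi r v}\left[\frac{d^2}{dt^2}\sum_k q_k\,\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k)\right]_{t=t'}. \tag{8.123}$$

40 If you need, see, e.g., MA Eq. (7.5).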
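The vector identity behind the split of $\mathbf{A}'$ into its magnetic dipole and electric quadrupole parts, Eq. (121), can be checked numerically for arbitrary vectors. The following Python sketch is an illustration only, with hypothetical helper names: it compares the original summand $\dot{\mathbf{r}}_k(\mathbf{r}_k\cdot\mathbf{n})$ with the sum of the "magnetic" part and the "quadrupole" part, the latter written as the explicit time derivative of $\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k)$ for a fixed direction $\mathbf{n}$.

```python
def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def scale(a, s):
    return tuple(s * x for x in a)


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def summand_direct(r, r_dot, n):
    """Left-hand side of Eq. (121): r_dot * (r . n)."""
    return scale(r_dot, dot(r, n))


def summand_split(r, r_dot, n):
    """Right-hand side of Eq. (121) as (magnetic part, quadrupole part)."""
    magnetic = scale(cross(cross(r, r_dot), n), 0.5)
    # (1/2) d/dt [ r (n . r) ] for a time-independent direction n:
    quadrupole = scale(
        add(scale(r, dot(n, r_dot)), scale(r_dot, dot(n, r))), 0.5
    )
    return magnetic, quadrupole


if __name__ == "__main__":
    r, r_dot, n = (1.0, 2.0, 3.0), (-0.5, 0.7, 0.2), (0.0, 0.6, 0.8)
    magnetic, quadrupole = summand_split(r, r_dot, n)
    print(summand_direct(r, r_dot, n))
    print(add(magnetic, quadrupole))
```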


Comparing the second of Eqs. (122) with Eq. (5.91), we see that $\mathbf{m}$ is just the magnetic moment of the source. On the other hand, the first of Eqs. (122) is absolutely similar in structure to Eq. (23), with $\mathbf{p}$ replaced by $(\mathbf{m}\times\mathbf{n})/v$, so that for the corresponding component of the magnetic field it gives (in the same approximation $r \gg \lambda$) the result similar to Eq. (24):

Magnetic dipole radiation field

$$\mathbf{B}_m(\mathbf{r},t) = \frac{\mu}{4\pi r v}\,\nabla\times\left[\dot{\mathbf{m}}\left(t - \frac{r}{v}\right)\times\mathbf{n}\right] = -\frac{\mu}{4\pi r v^2}\,\mathbf{n}\times\left[\ddot{\mathbf{m}}\left(t - \frac{r}{v}\right)\times\mathbf{n}\right]. \tag{8.124}$$

According to this expression, just as in the electric dipole radiation, vector $\mathbf{B}$ is perpendicular to vector $\mathbf{n}$, and its magnitude is also proportional to $\sin\theta$, where $\theta$ is now the angle between the direction toward the observation point and the second time derivative of vector $\mathbf{m}$ rather than $\mathbf{p}$:

$$B_m = \frac{\mu}{4\pi r v^2}\,\ddot{m}\left(t - \frac{r}{v}\right)\sin\theta. \tag{8.125}$$


As a result, the intensity of this magnetic dipole radiation has a similar angular distribution:

Magnetic dipole radiation power density

$$S_r = ZH^2 = \frac{Z}{(4\pi v^2 r)^2}\left[\ddot{m}\left(t - \frac{r}{v}\right)\right]^2\sin^2\theta \tag{8.126}$$

- cf. Eq. (26). Note, however, that this radiation is usually much weaker than its electric counterpart. For example, for a nonrelativistic particle with electric charge $q$, moving on a trajectory of size $\sim a$, the electric dipole moment is of the order of $qa$, while its magnetic moment scales as $qa^2\omega$, where $\omega$ is the motion frequency. As a result, the ratio of the magnetic and electric dipole radiation intensities is of the order of $(a\omega/v)^2$, i.e. the squared ratio of the particle's speed to the speed of the emitted waves - that has to be much smaller than 1 for our nonrelativistic estimate to be valid.


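The angular distribution of the electric quadrupole radiation, described by Eq. (123), is more complicated. In order to show this, we may add to $\mathbf{A}_q$ a vector parallel to $\mathbf{n}$ (i.e. along the wave propagation), getting

$$\mathbf{A}_q(\mathbf{r},t) \to \frac{\mu}{24\pi r v}\,\ddot{\mathbf{Q}}\left(t - \frac{r}{v}\right), \quad \text{where } \mathbf{Q} \equiv \sum_k q_k\left\{3\mathbf{r}_k(\mathbf{n}\cdot\mathbf{r}_k) - \mathbf{n}r_k^2\right\}, \tag{8.127}$$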


because this addition does not give any contribution to the transverse components of the electric and magnetic fields, i.e. to the radiated wave. According to the above definition of vector $\mathbf{Q}$, its Cartesian components may be presented as⁴¹

$$Q_j = \sum_{j'=1}^{3}\mathcal{Q}_{jj'}\,n_{j'}, \tag{8.128}$$

where $\mathcal{Q}_{jj'}$ are elements of the so-called electric quadrupole tensor $\mathcal{Q}$ of the system:⁴²

$$\mathcal{Q}_{jj'} \equiv \sum_k q_k\left(3r_jr_{j'} - r^2\delta_{jj'}\right)_k. \tag{8.129a}$$

For clarity, let me spell out the tensor in its matrix form:

$$\mathcal{Q} = \sum_k q_k\begin{pmatrix} 2x^2 - y^2 - z^2 & 3xy & 3xz \\ 3xy & 2y^2 - x^2 - z^2 & 3yz \\ 3xz & 3yz & 2z^2 - x^2 - y^2 \end{pmatrix}_k. \tag{8.129b}$$

41 In electrostatics, the symmetric, zero-trace tensor $\mathcal{Q}$ determines the next term in the potential expansion (3.5):

$$\phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\left(\ldots + \frac{1}{2r^5}\sum_{j,j'=1}^{3} r_j r_{j'}\mathcal{Q}_{jj'} + \ldots\right).$$


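Differentiating the first of Eqs. (127) at $r \gg \lambda$, we get

Electric quadrupole radiation field

$$\mathbf{B}_q(\mathbf{r},t) = -\frac{\mu}{24\pi r v^2}\,\mathbf{n}\times\dddot{\mathbf{Q}}\left(t - \frac{r}{v}\right). \tag{8.130}$$

Superficially, this expression is similar to Eqs. (24) or (124), but according to Eqs. (127) and (129), components of vector $\mathbf{Q}$ depend on the direction of vector $\mathbf{n}$, leading to a different angular dependence of $S_r$.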

As the simplest example, let us consider a system of two equal point electric charges moving at equal distances $d(t) \ll \lambda$ from a stationary center (Fig. 16).




Fig. 8.16. The simplest system emitting electric 
quadrupole radiation. 


Due to the symmetry of the system, its dipole moments $\mathbf{p}$ and $\mathbf{m}$ (and hence its electric and magnetic dipole radiation) vanish, but the quadrupole tensor (129) still has nonvanishing components. With the coordinate choice shown in Fig. 16, these components are diagonal:

$$\mathcal{Q}_{xx} = \mathcal{Q}_{yy} = -2qd^2, \quad \mathcal{Q}_{zz} = 4qd^2. \tag{8.131}$$

With axis $x$ in the plane of the direction $\mathbf{n}$ toward the source (Fig. 16), so that $n_x = \sin\theta$, $n_y = 0$, $n_z = \cos\theta$, Eq. (128) yields

$$Q_x = -2qd^2\sin\theta, \quad Q_y = 0, \quad Q_z = 4qd^2\cos\theta, \tag{8.132}$$

so that the vector product in Eq. (130) has only one nonvanishing Cartesian component:

$$\left(\mathbf{n}\times\dddot{\mathbf{Q}}\right)_y = n_z\dddot{Q}_x - n_x\dddot{Q}_z = -6q\sin\theta\cos\theta\,\frac{d^3}{dt^3}\left[d^2(t)\right]. \tag{8.133}$$

As a result, the radiation intensity is proportional to $\sin^2\theta\cos^2\theta$, i.e. vanishes not only along the symmetry axis (as the dipole radiation does), but also in all directions perpendicular to this axis, reaching its maximum at $\theta = \pi/4$.

42 As a math reminder, a tensor is a matrix describing a physical reality independent of the reference frame choice, so that the Cartesian elements of the tensor have to change according to certain geometric rules if the reference frame is changed - e.g., rotated. This notion is very similar to a physical vector, which may be described by an ordered set of its Cartesian components that change according to certain rules as the result of the reference frame's change. We may be confident that a matrix represents a tensor if it provides a linear relation between components of two physical vectors - such as $\mathbf{Q}$ and $\mathbf{n}$ in Eq. (128).
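The results (131)-(133) for the two-charge system are easy to reproduce numerically from the general definitions (128) and (129). The Python sketch below is an illustration only; it assumes, following the diagonal components (131), that the two charges sit on axis z at $z = \pm d$, and its function names are hypothetical. Since the direction $\mathbf{n}$ is fixed, the time differentiation in Eq. (133) acts only on $d^2(t)$, so that the angular factor may be checked for constant $d$.

```python
import math


def quadrupole_tensor(charges):
    """Quadrupole tensor per Eq. (129a); charges is a list of (q, (x, y, z))."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for q, r in charges:
        r2 = sum(c * c for c in r)
        for j in range(3):
            for jp in range(3):
                delta = 1.0 if j == jp else 0.0
                tensor[j][jp] += q * (3.0 * r[j] * r[jp] - r2 * delta)
    return tensor


def q_vector(tensor, n):
    """Vector Q with components given by Eq. (128)."""
    return [sum(tensor[j][jp] * n[jp] for jp in range(3)) for j in range(3)]


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def two_charge_system(q=1.0, d=1.0):
    """Two equal charges at z = +d and z = -d (assumed geometry of Fig. 16)."""
    return [(q, (0.0, 0.0, d)), (q, (0.0, 0.0, -d))]


def n_cross_q_y(theta, q=1.0, d=1.0):
    """y-component of n x Q for the direction n = (sin(theta), 0, cos(theta))."""
    n = (math.sin(theta), 0.0, math.cos(theta))
    return cross(n, q_vector(quadrupole_tensor(two_charge_system(q, d)), n))[1]


if __name__ == "__main__":
    for degrees in (0, 30, 45, 60, 90):
        theta = math.radians(degrees)
        print(degrees, n_cross_q_y(theta), -6.0 * math.sin(theta) * math.cos(theta))
```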

For more complex systems, the angular distribution of the electric quadrupole radiation may be different, but its total power may always be presented in a simple form

Electric quadrupole radiation power

(8.134)

Let me finish this section by giving, without proof, one more fact important for applications: due to their different spatial structure, the magnetic dipole and electric quadrupole radiation fields do not interfere, i.e. the total power of radiation (neglecting higher multipole terms) may be found as the sum of these components, calculated independently.



8.10. Exercise problems 

8.1. In the electric dipole approximation, calculate the angular distribution and total power of electromagnetic radiation by the following classical model of the hydrogen atom: an electron rotating, at a constant distance $R$, about a much heavier proton. Use the latter result to evaluate the classical lifetime of the atom, borrowing the initial value of $R$ from quantum mechanics: $R(0) = r_B \approx 0.53\times 10^{-10}$ m.

8.2. A nonrelativistic particle of mass $m$ with the electric charge $q$ is placed into a uniform magnetic field $\mathbf{B}$. Derive the law of decrease of the particle's kinetic energy due to its electromagnetic radiation at the cyclotron frequency $\omega_c = qB/m$. Evaluate the rate of such radiation cooling for electrons in a magnetic field of 1 T, and estimate the electron energy interval in which this result is qualitatively correct.

Hint: The cyclotron motion will be discussed in detail (for arbitrary particle velocities $v \sim c$) in Sec. 9.6 below, but I hope that the reader knows that in the nonrelativistic case ($v \ll c$) the above formula for $\omega_c$ may be readily obtained by combining the 2nd Newton law $mv_\perp^2/R = qv_\perp B$ for the circular motion of the particle under the effect of the magnetic component of the Lorentz force (5.10), and the geometric relation $v_\perp = R\omega_c$. (Here $v_\perp$ is the particle's velocity within the plane normal to vector $\mathbf{B}$.)

8.3. Solve the dipole antenna radiation problem discussed in Sec. 2 (see Fig. 3) for the optimal length $l = \lambda/2$, assuming⁴³ that the current distribution in each of its arms is sinusoidal:

$$I(z,t) = I_0\cos\frac{2\pi z}{\lambda}\cos\omega t.$$

43 As was emphasized in Sec. 2, this is a reasonable guess rather than a controllable approximation. The exact (rather involved!) theory shows that this assumption gives errors ~5%.

8.4. Use the harmonic oscillator model of a bound charge, given by Eq. (7.30), to explore the transition between the two scattering limits discussed in Sec. 3, in particular the resonant scattering taking place at $\omega \approx \omega_0$. In this context, discuss the contribution of scattering to the oscillator's damping.

8.5.* A sphere of radius $R$, made of a material with constant permanent electric polarization $\mathbf{P}_0$ and mass density $\rho$, is free to rotate about its center of mass. Calculate the total cross-section of scattering, by the sphere, of a linearly polarized electromagnetic wave of frequency $\omega \ll c/R$, propagating in free space, in the limit of small wave amplitude, assuming that the initial orientation of the polarization vector $\mathbf{P}_0$ is random.


8.6. Use the Born approximation to analyze the interference pattern produced by a plane wave's scattering on a set of $N$ similar, equidistant points on a straight line normal to the direction of the incident wave's propagation - see Fig. on the right. Discuss the trend(s) of the pattern in the limit $N \to \infty$.

[Figure: $N$ points with spacing $d$ on a line normal to the incident wave vector $\mathbf{k}_0$.]


8.7. Use the Born approximation to calculate the differential cross-section of plane wave scattering by a dielectric cube of side $a$, with $\varepsilon \approx \varepsilon_0$. In the limits $ka \ll 1$ and $ka \gg 1$ (where $k$ is the wave number), analyze the angular dependence of the differential cross-section. Calculate the full cross-section for the simplest case when the incident wave vector is parallel to one of the cube's sides.


8.8. Use the Born approximation to calculate the differential cross-section of plane wave scattering by a nonmagnetic, uniform dielectric sphere with $\varepsilon \approx \varepsilon_0$, of an arbitrary radius $R$. In the limits $kR \ll 1$ and $1 \ll kR$ (where $k$ is the wave number), analyze the angular dependence of the differential cross-section, and calculate the full cross-section.


8.9. A sphere of radius $R$ is made of a uniform, nonmagnetic, linear dielectric material. Calculate its full cross-section of scattering of a low-frequency monochromatic wave, with $k \ll 1/R$, for an arbitrary dielectric constant, and compare the result with the solution of the previous problem.

8.10. Solve the previous problem, also in the low-frequency limit $kR \ll 1$, for the case when the sphere's material has a frequency-independent Ohmic conductivity, $\varepsilon_{\text{opt}} = \varepsilon_0$, and a relatively large skin depth ($\delta_s \gg R$), and compare the results.

8.11. Use the Born approximation to calculate the differential cross-section of plane wave scattering on a right, circular cylinder of length $l$ and radius $R$, for arbitrary incidence.

8.12. Formulate the quantitative condition of the Born approximation validity for a uniform linear-dielectric scatterer with all linear dimensions of the order of $a$.


8.13. Use the Huygens principle to calculate the wave's intensity on the symmetry plane of the slit diffraction experiment (i.e. at $x = 0$ in Fig. 11), for an arbitrary ratio $z/ka^2$.

8.14. A plane wave with wavelength $\lambda$ is normally incident on an opaque, plane screen with a round orifice of radius $R \gg \lambda$. Use the Huygens principle to calculate the passed wave's intensity distribution along the system's symmetry axis, at distances $z \gg R$ from the screen (see Fig. on the right), and analyze the result.

8.15. A plane monochromatic wave is normally incident on an opaque circular disk of radius $R \gg \lambda$. Use the Huygens principle to calculate the wave's intensity at distance $z \gg R$ behind the disk center (see Fig. on the right). Discuss the result.


8.16. Use the Huygens principle to analyze the Fraunhofer diffraction of a plane wave normally incident on a square-shaped hole, of size $a \times a$, in an opaque screen. Sketch the diffraction pattern you would observe at a sufficiently large distance, and quantify the expression "sufficiently large" for this case.


8.17. Within the Fraunhofer approximation, analyze the pattern produced by a 1D diffraction grating with the periodic transparency profile shown in Fig. on the right, for the normal incidence of a plane, monochromatic wave.



8.18 . N equal point charges are attached, at equal intervals, to a circle 
rotating with a constant angular velocity about its center - see Fig. on the right. 

For what values of N does the system emit: 

(i) the electric dipole radiation? 

(ii) the magnetic dipole radiation? 

(iii) the electric quadrupole radiation? 

8.19. The orientation of a magnetic dipole $\mathbf{m}$, of a fixed magnitude, is rotating about a certain axis with angular velocity $\omega$, with the angle $\theta$ between them staying constant. Calculate the angular distribution and the average power of its radiation (into free space).

8.20. Complete the solution of the problem started in Sec. 9, by calculating the full power of radiation of the system of two charges oscillating in antiphase along the same straight line - see Fig. 16. Also, calculate the average radiation power for the case of harmonic oscillations, $d(t) = a\cos\omega t$, compare it with the case of a single charge performing similar oscillations, and interpret the difference.

Chapter 9. Special Relativity

This chapter starts with a brief review of the special relativity’s basics. This background is used, later in 
the chapter, for the analysis of the relation between electromagnetic field values measured in different 
reference frames moving relative to each other, and discussions of relativistic particle dynamics in the 
electric and magnetic fields, and of analytical mechanics of electromagnetism. 


9.1. Einstein postulates and the Lorentz transform 

As was emphasized in the derivation of expressions for the dipole and quadrupole radiation in the last chapter, they are only valid for systems of nonrelativistic particles. Thus, these results cannot be used for the description of such important phenomena as the Cherenkov radiation or synchrotron radiation, in which relativistic effects are essential. Moreover, the analysis of the motion of charged relativistic particles in electric and magnetic fields is also a natural part of electrodynamics. This is why I will follow the tradition of using this course for a (by necessity, brief) introduction to the special relativity theory. This theory is based on the idea that measurements of all physical variables (including spatial and even temporal intervals between two events) may give different results in different reference frames, in particular in two frames moving relative to each other translationally (i.e. without rotation), with a certain constant velocity $\mathbf{v}$ (Fig. 1).

Fig. 9.1. Translational, uniform motion of two reference frames, with radius-vectors $\mathbf{r} = \{x, y, z\}$ and $\mathbf{r}' = \{x', y', z'\}$.



In the nonrelativistic (Newtonian) mechanics, the problem of transfer between such reference frames has a simple solution, at least in the limit $v \ll c$, because the basic equation of particle dynamics (the 2nd Newton law)¹

$$m_k\ddot{\mathbf{r}}_k = -\nabla_k\sum_{k'}U(\mathbf{r}_k - \mathbf{r}_{k'}), \tag{9.1}$$

where $U$ is the potential energy of inter-particle interactions, is invariant with respect to the so-called Galilean transform (or "transformation").² Choosing the coordinate axes of both frames so that axes $x$ and $x'$ are parallel to vector $\mathbf{v}$ (Fig. 1), the transform³ may be presented as

Galilean transform

$$x = x' + vt', \quad y = y', \quad z = z', \quad t = t', \tag{9.2a}$$

and plugging Eq. (2a) into Eq. (1), we get an absolutely similarly looking equation of motion in the "moving" reference frame 0'. Since the reciprocal transform,

$$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{9.2b}$$

is similar to the direct one, with the replacement of $(+v)$ with $(-v)$, we may say that the Galilean invariance means that there is no "master" (absolute) spatial reference frame in classical mechanics, although the spatial and temporal intervals between different events are absolute (reference-frame invariant).

1 Let me hope that the reader does not need a reminder that in order for Eq. (1) to be valid, the reference frames 0 and 0' have to be inertial - see, e.g., CM Sec. 1.3.

2 It was first formulated by G. Galilei as early as 1638, four years before I. Newton was born!

3 Note the very unfortunate term "boost", sometimes used for the transform between the reference frames. (It is especially unnatural in special relativity, since it does not describe any acceleration.) In these notes, this term is avoided.


However, it is straightforward to use Eq. (2) to check that the form of the wave equation

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)f = 0, \tag{9.3}$$

describing in particular the electromagnetic wave propagation in free space,⁴ is not Galilean-invariant.⁵
For the “usual” (say, elastic) waves, which obey a similar equation albeit with a different speed, 6 this 
lack of Galilean invariance is natural and is compatible with the invariance of Eq. (1), from which the 
wave equation originates. This is because the elastic waves are essentially the oscillations of interacting 
particles of a certain medium (e.g., an elastic solid), which makes the reference frame connected to this 
medium, special. So, if the electromagnetic waves were oscillations of a certain special medium (that 
was first called the “luminiferous aether” 7 and later just ether), similar arguments might be applicable to 
reconcile Eqs. (2) and (3). 
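The lack of Galilean invariance of the wave equation (3) can be demonstrated numerically. The following Python sketch is only an illustration with arbitrarily chosen parameters: it takes a 1D sinusoidal wave that satisfies Eq. (3) in the "rest" frame, re-expresses it via the Galilean transform (2a), and applies the same wave operator (with the same speed $c$) in the primed coordinates, using finite differences. In the moving frame the residual does not vanish, because the wave there propagates with speed $c - v$ rather than $c$.

```python
import math


def wave(x, t, c=1.0, k=2.0):
    """A 1D solution of Eq. (3): a wave traveling along x with speed c."""
    return math.sin(k * (x - c * t))


def wave_in_moving_frame(xp, tp, v, c=1.0, k=2.0):
    """The same wave as a function of primed coordinates, using Eq. (2a)."""
    return wave(xp + v * tp, tp, c, k)


def wave_operator(f, x, t, c=1.0, h=1e-3):
    """Finite-difference estimate of (d2/dx2 - (1/c^2) d2/dt2) f at (x, t)."""
    d2x = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h ** 2
    d2t = (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / h ** 2
    return d2x - d2t / c ** 2


if __name__ == "__main__":
    v = 0.5  # frame velocity in units of c, chosen for illustration

    def moving(xp, tp):
        return wave_in_moving_frame(xp, tp, v)

    print("rest frame residual:  ", wave_operator(wave, 0.3, 0.7))
    print("moving frame residual:", wave_operator(moving, 0.3, 0.7))
```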

The detection of such a medium was the goal of the Michelson-Morley measurements (carried 
out between 1881 and 1887 with better and better precision), that are sometimes called “the most 
famous failed experiment in physics”. Figure 2 shows a crude scheme of their experiments. 





4 Discussions in this chapter and most of the next chapter will be restricted to the free-space (and hence dispersion-free) case; some media effects on radiation by relativistic particles will be discussed in Sec. 10.4.

5 It is interesting that the Schrödinger equation, whose fundamental solution for a free particle is a similar monochromatic wave (albeit with a different dispersion law), is Galilean-invariant, with a certain addition to the wavefunction's phase - see, e.g., QM Chapter 1.

6 See, e.g., CM Secs. 5.5 and 7.7.

7 In the ancient Greek mythology, aether is the clear upper air breathed by the gods residing on mount Olympus.

A nearly-monochromatic wave is split in two parts (nominally, of equal intensity), using a semi-transparent mirror tilted by 45° to the incident wave direction. These two partial waves are reflected back by two genuine mirrors, and arrive at the same semi-transparent mirror again. Here a half of each wave is returned to the light source area (where they vanish without affecting the source), but another half passes toward the detector, forming, with its counterpart, an interference pattern similar to that in the Young experiment. Thus each of the interfering waves has traveled twice (back and forth) each of two mutually perpendicular "arms" of the interferometer. Assuming that the ether, in which light propagates with speed $c$, moves with speed $v < c$ along one of the arms, of length $l_l$, it is straightforward (and hence left for the reader's exercise :-) to get the following expression for the difference between the light roundtrip times:

$$\Delta t = \frac{2}{c}\left[\frac{l_l}{1 - v^2/c^2} - \frac{l_t}{(1 - v^2/c^2)^{1/2}}\right] \approx \frac{l}{c}\left(\frac{v}{c}\right)^2, \tag{9.4}$$

where $l_t$ is the length of the second, "transverse" arm of the interferometer (perpendicular to $\mathbf{v}$), and the last, approximate expression is valid at $l_t \approx l_l \equiv l$ and $v \ll c$.

Since Earth moves around the Sun with speed $v_E \approx 30$ km/s $\approx 10^{-4}c$, the arm positions relative to this motion alternate, due to Earth's rotation about its axis, every 6 hours - see the right panel of Fig. 2. Hence if we assume that the ether rests in the Sun's reference frame, $\Delta t$ (and the corresponding shift of interference fringes) has to alternate with this half-period as well. The same alternation may be achieved, at a smaller time scale, by a deliberate rotation of the instrument by $\pi/2$. In the most precise version of the Michelson-Morley experiment (1887), this shift was expected to be close to 0.4 of the fringe pattern period. The result was negative, with the error bar about 0.01 of the fringe period.⁸
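The expected fringe shift is easy to reproduce from Eq. (4). The Python sketch below is an illustration only: the effective arm length (11 m) and the light wavelength (500 nm) are assumptions typical for descriptions of the 1887 experiment, not values given in these notes. It takes into account that a rotation of the instrument by $\pi/2$ swaps the roles of the two (equal) arms, so that the roundtrip time difference changes from $+\Delta t$ to $-\Delta t$, and the pattern shifts by $2c\Delta t/\lambda$ fringe periods.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s


def roundtrip_time_difference(l_long, l_trans, v, c=C):
    """Exact form of Eq. (4): longitudinal minus transverse roundtrip time."""
    beta2 = (v / c) ** 2
    return (2.0 / c) * (l_long / (1.0 - beta2) - l_trans / math.sqrt(1.0 - beta2))


def fringe_shift_on_rotation(arm_length, v, wavelength, c=C):
    """Fringe shift (in pattern periods) after rotating the instrument by pi/2.

    The rotation swaps the roles of the two equal arms, so that the time
    difference changes sign and the total change is 2 * delta_t.
    """
    delta_t = roundtrip_time_difference(arm_length, arm_length, v, c)
    return 2.0 * c * delta_t / wavelength


if __name__ == "__main__":
    # Assumed parameters: 11 m effective arm length, 500 nm light, 30 km/s.
    print(fringe_shift_on_rotation(11.0, 3.0e4, 5.0e-7))
```

With these assumed numbers the sketch gives a shift of about 0.44 of the fringe period, consistent with the value quoted above.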

The most prominent immediate explanation of this zero result⁹ was suggested in 1889 by G. FitzGerald and (independently and more qualitatively) by H. Lorentz in 1892: as evident from Eq. (4), if the longitudinal arm of the interferometer itself experiences the so-called length contraction,

$$l_l(v) = l_l(0)\left(1 - \frac{v^2}{c^2}\right)^{1/2}, \tag{9.5}$$

while the transverse arm's length is not affected by the motion relative to the ether, this kills $\Delta t$. This extremely radical idea received strong support from the proof, in 1887-1905, that the Maxwell equations, and hence the wave equation (3), are form-invariant under the so-called Lorentz transform.¹⁰ For the choice of coordinates shown in Fig. 1, the transform reads

Lorentz transform

$$x = \frac{x' + vt'}{(1 - v^2/c^2)^{1/2}}, \quad y = y', \quad z = z', \quad t = \frac{t' + (v/c^2)x'}{(1 - v^2/c^2)^{1/2}}. \tag{9.6a}$$

It is elementary to solve these equations for the primed coordinates to get the reciprocal transform

$$x' = \frac{x - vt}{(1 - v^2/c^2)^{1/2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - (v/c^2)x}{(1 - v^2/c^2)^{1/2}}. \tag{9.6b}$$

(I will soon present Eqs. (6) in a more elegant form.)

8 Through the 20th century, the Michelson-Morley-type experiments were repeated using more and more refined experimental techniques, always with the zero result for the apparent ether motion speed. For example, recent experiments, using cryogenically cooled optical resonators, have reduced the upper limit for such speed to just $3\times 10^{-15}c$ - see H. Müller et al., Phys. Rev. Lett. 91, 020401 (2003).

9 The zero result of a slightly later experiment, namely precise measurements of the torque that should be exerted by the moving ether on a charged capacitor, carried out in 1903 by F. Trouton and H. Noble (following G. FitzGerald's suggestion), seconded Michelson and Morley's conclusions.

10 The theoretical work toward this goal (which I do not have time to review in detail) included important contributions by W. Voigt (in 1887), H. Lorentz (1892-1904), J. Larmor (1897 and 1900), and H. Poincaré (1900 and 1905).
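Relations (6) are simple enough to be explored numerically. The following Python sketch is only an illustration, with hypothetical function names and with the coordinates y and z (which are not affected by the transform) omitted. It implements Eqs. (6a) and (6b) and may be used to check that they are reciprocal, that they conserve the combination $(ct)^2 - x^2$, and that they approach the Galilean relations (2) at $v \ll c$.

```python
import math


def lorentz_to_unprimed(xp, tp, v, c=1.0):
    """Eq. (6a): coordinates (x, t) from the primed-frame coordinates (x', t')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (xp + v * tp), gamma * (tp + v * xp / c ** 2)


def lorentz_to_primed(x, t, v, c=1.0):
    """Eq. (6b): the reciprocal transform, i.e. Eq. (6a) with v replaced by -v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)


def interval_squared(x, t, c=1.0):
    """The combination (ct)^2 - x^2 for an event on the line y = z = 0."""
    return (c * t) ** 2 - x ** 2


if __name__ == "__main__":
    xp, tp, v = 1.0, 2.0, 0.6  # units with c = 1, chosen for illustration
    x, t = lorentz_to_unprimed(xp, tp, v)
    print("unprimed coordinates:", x, t)
    print("back to primed:      ", lorentz_to_primed(x, t, v))
    print("intervals:           ", interval_squared(x, t), interval_squared(xp, tp))
```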

The Lorentz transform relations (6) evidently reduce to the Galilean transform formulas (2) at $v^2 \ll c^2$. As will be proved in the next section, Eqs. (6) also yield the Lorentz length contraction (5). However, all attempts to give a reasonable interpretation of these equations while keeping the notion of the ether have failed, in particular because of the restrictions imposed by the results of earlier experiments carried out in 1851 and 1853 by H. Fizeau - that were repeated with higher accuracy by the same Michelson and Morley in 1886. These experiments have shown that if one sticks to the ether concept, this hypothetical medium should be partially "dragged" by any moving dielectric medium with a speed proportional to $(\varepsilon_r - 1)$. Careful reasoning shows that such local drag is irreconcilable with the assumed continuity of the ether.

In his famous 1905 paper, Albert Einstein suggested a bold resolution of this contradiction, essentially removing the concept of the ether altogether. Moreover, he argued that the Lorentz transform is the general property of time and space, rather than of the electromagnetic field alone. He started with two postulates, the first one essentially repeating the principle of relativity, formulated earlier (1904) by H. Poincaré in the following form:

"...the laws of physical phenomena should be the same, whether for an observer fixed, or for an observer carried along in a uniform movement of translation; so that we have not and could not have any means of discerning whether or not we are carried along in such a motion."¹¹

Einstein's second postulate was that the speed of light $c$, in free space, should be constant in all reference frames. (This is essentially a denial of the ether's existence.)

Then, Einstein showed how naturally do the Lorenz transform relations (6) follow from his 
postulates, with a few (very natural) additional assumptions. Let a point source emit a short flash of 
light, at the moment t = t’ = 0 when origins of the reference frames shown in Fig. 1 coincide. Then, 
according to the second of Einstein’s postulates, in each of the frames the spherical wave propagates 
with the same speed c, i.e. coordinates of points of its front, measured in the two frames, have to obey 
equations 

$$(ct)^2 - (x^2 + y^2 + z^2) = 0, \qquad (ct')^2 - (x'^2 + y'^2 + z'^2) = 0.$$ (9.7)


What may be the general relation between the combinations in the left-hand side of these equations - not 
for this selected pair of events, the light flash and its detection, but in general? A very natural 
(essentially, the only justifiable) choice is 


11 Note that though the relativity principle excludes the notion of the special (“absolute”) spatial reference frame, 
its verbal formulation still leaves the possibility of the Galilean “absolute time” open. The quantitative relativity 
theory kills this option - see Eqs. (6) and their discussion below. 
$$(ct)^2 - (x^2 + y^2 + z^2) = f(v^2)\left[(ct')^2 - (x'^2 + y'^2 + z'^2)\right].$$ (9.8)

Now, according to the first postulate, the same relation should be valid if we swap the reference frames ($x \leftrightarrow x'$, etc.) 12 and replace $v$ with $(-v)$. This is only possible if $f^2 = 1$, so that excluding the option $f = -1$ (which is incompatible with the Galilean transform in the limit $v/c \to 0$), we get

$$(ct)^2 - (x^2 + y^2 + z^2) = (ct')^2 - (x'^2 + y'^2 + z'^2).$$ (9.9)

For the line $y = y' = 0$, $z = z' = 0$, Eq. (9) is reduced to

$$(ct)^2 - x^2 = (ct')^2 - x'^2.$$ (9.10)

It is very illuminating to interpret this relation as the one resulting from a mutual rotation of the reference frames (that now have to include clocks to measure time) on the plane of coordinate $x$ and the so-called Euclidean time $\tau \equiv ict$ - see Fig. 3.

Fig. 9.3. The Lorentz transform as a mutual rotation of reference frames on the [$x$, $\tau$] plane.


Indeed, rewriting Eq. (10) as

$$\tau^2 + x^2 = \tau'^2 + x'^2,$$ (9.11)

we may consider it as the invariance of the squared radius at the rotation that is shown in Fig. 3 and described by the evident geometric relations

$$x = x'\cos\psi - \tau'\sin\psi, \qquad \tau = x'\sin\psi + \tau'\cos\psi,$$ (9.12a)

with the reciprocal relations

$$x' = x\cos\psi + \tau\sin\psi, \qquad \tau' = -x\sin\psi + \tau\cos\psi.$$ (9.12b)


So far, angle $\psi$ has been arbitrary. In the spirit of Eq. (8), a natural choice is $\psi = \psi(v)$, with the requirement $\psi(0) = 0$. In order to find this function, let us write the definition of velocity $v$ of frame 0', as measured in reference frame 0: for $x' = 0$, $x = vt$. In variables $x$ and $\tau$, this means

$$\left.\frac{x}{\tau}\right|_{x'=0} \equiv \left.\frac{x}{ict}\right|_{x'=0} = \frac{v}{ic}.$$ (9.13)


On the other hand, for the same point x’ = 0, Eqs. (12a) yield 


12 Strictly speaking, at this swap we should also replace v with (-v), but this change does not affect Eq. (8). 
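$$\left.\frac{x}{\tau}\right|_{x'=0} = -\tan\psi.$$ (9.14)

These two expressions are compatible only if

$$\tan\psi = \frac{iv}{c},$$ (9.15)

so that

$$\sin\psi \equiv \frac{\tan\psi}{(1 + \tan^2\psi)^{1/2}} = \frac{iv/c}{(1 - v^2/c^2)^{1/2}} \equiv i\beta\gamma, \qquad \cos\psi \equiv \frac{1}{(1 + \tan^2\psi)^{1/2}} = \frac{1}{(1 - v^2/c^2)^{1/2}} \equiv \gamma,$$ (9.16)

where $\beta$ and $\gamma$ are two very convenient and commonly used dimensionless parameters defined as

$$\boldsymbol{\beta} \equiv \frac{\mathbf{v}}{c}, \qquad \gamma \equiv \frac{1}{(1 - v^2/c^2)^{1/2}} = \frac{1}{(1 - \beta^2)^{1/2}}.$$ (9.17)

(Vector $\boldsymbol{\beta}$ is called the normalized velocity, while scalar $\gamma$, the Lorentz factor.) 13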

Using the relations for $\psi$, Eqs. (12) become

$$x = \gamma(x' - i\beta\tau'), \qquad \tau = \gamma(i\beta x' + \tau'),$$ (9.18a)

$$x' = \gamma(x + i\beta\tau), \qquad \tau' = \gamma(-i\beta x + \tau).$$ (9.18b)

Now returning to the real variables [$x$, $ct$], we get the Lorentz transform relations (6) in a more compact form: 14

$$x = \gamma(x' + \beta ct'), \quad y = y', \quad z = z', \quad ct = \gamma(ct' + \beta x'),$$ (9.19a)

$$x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z, \quad ct' = \gamma(ct - \beta x).$$ (9.19b)

An immediate corollary of Eqs. (6) is that for $\gamma$ to stay real, we need $v < c$, i.e. the speed of any physical body (to which we could connect a reference frame) cannot exceed the speed of light, as measured in any physically meaningful reference frame. 15
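
The equivalence of the "rotation" relations (12) and the real-variable form (19) is easy to check numerically. The following Python sketch (the function names are illustrative and not part of the original notes) applies Eq. (19a) directly, then repeats the transform as a rotation (12a) by the imaginary angle $\psi$ with $\tan\psi = i\beta$, so that the two results and the invariance (10) of $(ct)^2 - x^2$ may be compared.

```python
import cmath
import math


def lorentz_to_lab(x_p, ct_p, beta):
    """Eq. (9.19a): event (x', ct') in frame 0' -> (x, ct) in frame 0."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (x_p + beta * ct_p), gamma * (ct_p + beta * x_p)


def rotation_to_lab(x_p, ct_p, beta):
    """Same transform via Eqs. (9.12a), with tau = i*ct and tan(psi) = i*beta."""
    psi = cmath.atan(1j * beta)      # purely imaginary rotation angle, Eq. (9.15)
    tau_p = 1j * ct_p                # "Euclidean time" of the event in frame 0'
    x = x_p * cmath.cos(psi) - tau_p * cmath.sin(psi)
    tau = x_p * cmath.sin(psi) + tau_p * cmath.cos(psi)
    return x.real, (tau / 1j).real   # back to the real variables [x, ct]


x, ct = lorentz_to_lab(3.0, 5.0, 0.6)
print(x, ct, ct**2 - x**2)           # the last number should match 5**2 - 3**2
print(rotation_to_lab(3.0, 5.0, 0.6))
```

Both functions should return the same pair of numbers up to rounding errors, illustrating that the imaginary angle is just a bookkeeping device for the real transform.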


9.2. Relativistic kinematic effects 

In order to discuss other corollaries of Eqs. (19), we need to spend a few minutes discussing what these relations actually mean. Evidently, they are trying to tell us that the spatial and temporal intervals are not absolute (as they are in the Newtonian space), but do depend on the reference frame they are measured in. So, we have to understand very clearly what exactly may be measured - and thus may be discussed in a physics theory. Recognizing this necessity, A. Einstein introduced the notion of numerous imaginary observers that may be distributed all over each reference frame. Each observer has a clock and may use it to measure the instants of local events. He also conjectured that:


13 One more function of $\beta$, the rapidity $\varphi$ defined by $\beta = \tanh\varphi$ (so that $\psi = i\varphi$), is also useful for some calculations.

14 Still, in some cases below, it will be more convenient to use Eqs. (6) rather than Eqs. (19). 

15 All attempts to rationally conjecture particles moving with v > c, called tachyons, have failed (so far, at least :-). 
Possibly the strongest objection against their existence is the notice that tachyons could be used to communicate 
back in time, thus violating the causality principle - see, e.g., G. Benford et al., Phys. Rev. D 2 , 263 (1970). 
(i) all observers within the same reference frame may agree on a common length measure (“a scale”), i.e. on their relative positions in that frame, and synchronize their clocks, 16 and

(ii) observers belonging to different reference frames may agree on the nomenclature of world 
events (e.g., short flashes of light) to which their respective measurements belong. 

Actually, these additional postulates have been already implied in our “derivation” of the Lorentz transform in Sec. 1. For example, by $\{x, y, z, t\}$ we mean the results of space and time measurements of a certain world event, about which all observers belonging to frame 0 agree. Similarly, all observers of frame 0' have to agree about the results $\{x', y', z', t'\}$. Finally, when the origin of frame 0' passes by some sequential points $x_k$ of frame 0, observers in that frame may measure its passage times $t_k$ without a fundamental error, and know that all these times belong to $x' = 0$.

Now we can analyze the major corollaries of the Lorentz transform, which are rather striking 
from the point of view of our everyday (rather nonrelativistic :-) experience. 

(i) Length contraction. Let us consider a rigid rod, stretched along axis $x$, with length $l \equiv x_2 - x_1$, where $x_{1,2}$ are the coordinates of the rod's ends, as measured in its rest frame 0, at any instant $t$ (Fig. 4). What would be the rod's length $l'$ measured by the Einstein observers in the moving frame 0'?



Fig. 9.4. Relativistic length contraction. 


At a time instant $t'$ agreed upon in advance, the observers who find themselves exactly at the rod's ends may register that fact, and then subtract their coordinates $x'_{1,2}$ to calculate the apparent rod length $l' \equiv x_2' - x_1'$ in the moving frame. According to Eq. (19a), $l$ may be expressed via $l'$ as

$$l \equiv x_2 - x_1 = \gamma(x_2' + \beta ct') - \gamma(x_1' + \beta ct') = \gamma(x_2' - x_1') \equiv \gamma l' > l'.$$ (9.20a)

Hence, the rod's length, as measured in the moving reference frame, is

$$l' = \frac{l}{\gamma} = l\,(1 - v^2/c^2)^{1/2} < l,$$ (9.20b)


in accordance with the FitzGerald-Lorentz hypothesis (5). This is the relativistic length contraction 
effect: an object is always the longest (has the so-called proper length l) if measured in its rest frame. 
Note that according to Eq. (19), the length contraction takes place only in the direction of the relative 
motion of two reference frames. As has been noted in Sec. 1, this result immediately explains the zero 





16 A posteriori, the Lorentz transform may be used to show that consensus-creating procedures (such as clock synchronization) are indeed possible. The basic idea of the proof is that at $v \ll c$ the relativistic corrections to space and time intervals are of the order of $(v/c)^2$, so they have negligible effects on clocks being brought together into the same point for synchronization very slowly, with velocity $v \ll c$. The reader interested in a detailed discussion of this and other fine points of special relativity may be referred to, e.g., either H. Arzeliès, Relativistic Kinematics, Pergamon, 1966, or W. Rindler, Introduction to Special Relativity, 2nd ed., Oxford U. Press, 1991.
result of the Michelson-Morley-type experiments, so that they give convincing evidence (if not an irrefutable proof) of Eq. (20).

(ii) Time dilation. Now let us use Eqs. (19a) to find the time interval $\Delta t$, as measured in frame 0, between two world events - say, two ticks of a clock moving with frame 0' (Fig. 5), i.e. having constant values of $x'$, $y'$, and $z'$.



Fig. 9.5. Relativistic time dilation. 


Let the time interval between these two events, measured in the clock's rest frame 0', be $\Delta t' \equiv t_2' - t_1'$. At these two moments, the clock would fly by two of Einstein's observers at rest in frame 0, so that they can record the corresponding moments $t_{1,2}$ shown by their clocks, and then calculate $\Delta t$ as their difference. According to the last of Eqs. (19a),

$$\Delta t \equiv t_2 - t_1 = \frac{\gamma}{c}\left[(ct_2' + \beta x') - (ct_1' + \beta x')\right] = \gamma\,\Delta t',$$ (9.21a)

so that, finally,

$$\Delta t = \gamma\,\Delta t' = \frac{\Delta t'}{(1 - v^2/c^2)^{1/2}} > \Delta t'.$$ (9.21b)

This is the famous relativistic time dilation (or “dilatation”) effect: a time interval is longer if measured in a frame (in our case, frame 0) moving relative to the clock, while that in the clock's rest frame is the shortest - the so-called proper time interval.


This counter-intuitive effect is the everyday reality in experiments with high-energy elementary particles. For example, in a typical (by no means record-breaking) experiment carried out at Fermilab, a beam of charged 200 GeV pions with $\gamma \approx 1{,}400$ passed the distance $l = 300$ m with the measured loss of only 3% of the initial beam intensity due to the pion decay (mostly, into muon-neutrino pairs) with the proper lifetime $\tau_0 \approx 2.56\times 10^{-8}$ s. Without the time dilation, only an $\exp\{-l/c\tau_0\} \sim 10^{-17}$ part of the initial pions would survive, while the relativity-corrected number $\exp\{-l/c\tau\} = \exp\{-l/c\gamma\tau_0\} \approx 0.97$ was in full accordance with experimental measurements. As another example, the global positioning system (GPS) is designed with the account of the time dilation due to the velocity of its satellites (and also some gravity-induced, i.e. general-relativity corrections that I do not have time to discuss) and would give large errors without such corrections. So, there is no doubt that time dilation (21) is a reality, though the precision of its experimental tests I am aware of has been limited to a few percent, because of the almost unavoidable involvement of gravity effects. 17
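
The two survival estimates quoted above are easy to reproduce. The short Python sketch below uses only the numbers given in the text ($\gamma \approx 1{,}400$, $l = 300$ m, $\tau_0 \approx 2.56\times 10^{-8}$ s); the rounded value of $c$, the assumption that the beam moves at virtually the speed of light, and the function name are mine.

```python
import math

C = 2.998e8        # speed of light, m/s (rounded value, an assumption)
TAU_0 = 2.56e-8    # proper lifetime quoted in the text, s
GAMMA = 1400.0     # Lorentz factor quoted in the text
LENGTH = 300.0     # flight distance quoted in the text, m


def surviving_fraction(length, lifetime):
    """Fraction exp{-l/(c*tau)} of a beam moving at ~c that survives distance l."""
    return math.exp(-length / (C * lifetime))


no_dilation = surviving_fraction(LENGTH, TAU_0)            # lab lifetime = tau_0
with_dilation = surviving_fraction(LENGTH, GAMMA * TAU_0)  # lab lifetime = gamma*tau_0
print(f"without dilation: {no_dilation:.1e}, with dilation: {with_dilation:.3f}")
```

The first number should come out of the order of $10^{-17}$ and the second one close to 0.97, i.e. a loss of about 3%, in agreement with the figures in the text.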


Before the first reliable observation of the time dilation (by B. Rossi and D. Hall in 1940), there had been serious doubts about the reality of this effect, the most famous being the twin paradox first posed


17 See, e.g., J. Hafele and R. Keating, Science 177, 166 (1972).
(together with an immediate suggestion of its resolution) by P. Langevin in 1911. Let us send one of two twins on a long star journey with a speed $v$ approaching $c$. Upon his return to Earth, which of the twins would be older? The naive approach is to say that due to the relativity principle, neither can be (and hence there is no time dilation), because each twin could claim that his counterpart, rather than himself, was moving, with the same speed $v$, just in the opposite direction. The resolution of the paradox in the general theory of relativity (which can handle gravity and acceleration effects) is that one of the twins had to be accelerated to be brought back, and hence the reference frames have to be dissimilar: only one of them may stay inertial all the time. Because of that, the twin who had been accelerated (“actually traveling”) would be younger than his sibling when they meet.

(iii) Velocity transformation. Now let us calculate velocity u of a particle, as observed in 
reference frame 0, provided that its velocity, as measured in frame 0 ’, is u ’ (Fig. 6). 




Fig. 9.6. Relativistic velocity addition. 


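Keeping the usual definition of velocity, but with due attention to the relativity of not only spatial but also temporal intervals, we may write

$$\mathbf{u} \equiv \frac{d\mathbf{r}}{dt}, \qquad \mathbf{u}' \equiv \frac{d\mathbf{r}'}{dt'}.$$ (9.21)

Plugging in the differentials of the Lorentz transform relations (6a), we get

$$u_x \equiv \frac{dx}{dt} = \frac{dx' + v\,dt'}{dt' + v\,dx'/c^2} = \frac{u_x' + v}{1 + u_x' v/c^2}, \qquad u_y \equiv \frac{dy}{dt} = \frac{1}{\gamma}\frac{dy'}{dt' + v\,dx'/c^2} = \frac{1}{\gamma}\frac{u_y'}{1 + u_x' v/c^2},$$ (9.22)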


and the similar formula for $u_z$. In the classical limit $v/c \to 0$, these relations are reduced to

$$u_x = u_x' + v, \quad u_y = u_y', \quad u_z = u_z',$$ (9.23)

and may be merged into the familiar Galilean vector form

$$\mathbf{u} = \mathbf{u}' + \mathbf{v}, \quad \text{for } v \ll c.$$ (9.24)


In order to see how strange the full relativistic rules (22) are, let us first consider a purely longitudinal motion, $u_y = u_z = 0$; then 18

$$u = \frac{u' + v}{1 + u'v/c^2},$$ (9.25)


18 With an account of the well-known trigonometric identity $\tan(a + b) = (\tan a + \tan b)/(1 - \tan a \tan b)$ and Eq. (15), Eq. (25) shows that rapidities $\varphi = -i\psi$ add up exactly as longitudinal velocities in nonrelativistic motion, making that notion very convenient for the analysis of transfer between several frames.
where $u \equiv u_x$ and $u' \equiv u_x'$. Figure 7 shows $u$ as a function of $u'$, given by this formula, for several values of the reference frames' relative velocity $v$.

The first sanity check is that if $v = 0$, i.e. the reference frames are at rest relative to each other, then $u = u'$, as it should be - see the diagonal straight line. Next, if the magnitudes of $u'$ and $v$ are both below $c$, so is the magnitude of $u$. (Also good, because otherwise ordinary particles in one frame would be tachyons in the other one, and the theory would be in big trouble.) Now strange things start: even as $u'$ and $v$ both approach $c$, $u$ also gets close to $c$, but does not exceed it. As an example, if we fired ahead a bullet with speed $0.9c$ from a spaceship moving from the Earth also at $0.9c$, Eq. (25) predicts the speed of the bullet relative to Earth to be just $[(0.9 + 0.9)/(1 + 0.9\times 0.9)]c \approx 0.994c < c$, rather than $(0.9 + 0.9)c = 1.8c > c$ as in the Galilean kinematics. We certainly should accept this strangeness of relativity, because it is necessary to fulfill Einstein's second postulate: the independence of the speed of light of the reference frame. Indeed, for $u' = \pm c$, Eq. (25) yields $u = \pm c$, regardless of $v$.
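
These properties of Eq. (25) may be verified with a few lines of Python. This is only a sketch: velocities are expressed in units of $c$, and the function names are illustrative. It also checks the rapidity additivity mentioned in footnote 18, using the definition $\beta = \tanh\varphi$ from footnote 13.

```python
import math


def add_velocities(beta_u_prime, beta_v):
    """Eq. (9.25) in units of c: u/c from u'/c and the frames' relative velocity v/c."""
    return (beta_u_prime + beta_v) / (1.0 + beta_u_prime * beta_v)


def rapidity(beta):
    """Rapidity phi defined by beta = tanh(phi)."""
    return math.atanh(beta)


bullet = add_velocities(0.9, 0.9)      # the spaceship-and-bullet example
print(bullet)                          # stays below 1, i.e. below c
print(add_velocities(1.0, 0.5))        # light moves with speed c in both frames
print(rapidity(bullet), 2 * rapidity(0.9))  # rapidities simply add up
```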

In the opposite case of transverse motion, when a particle moves across the relative motion of the frames (for example, at our choice of coordinates, $u_x' = u_z' = 0$), Eqs. (22) yield a less spectacular result

$$u_y = \frac{u_y'}{\gamma} < u_y'.$$ (9.26)

This effect comes purely from the time dilation, because the transverse coordinates are Lorentz-invariant.

In the case when both $u_x'$ and $u_y'$ are substantial (but $u_z'$ is still zero), we may divide expressions (22) by each other to relate the angles $\theta$ of particle propagation, as observed in the two reference frames:

$$\tan\theta \equiv \frac{u_y}{u_x} = \frac{u_y'}{\gamma(u_x' + v)} = \frac{\sin\theta'}{\gamma(\cos\theta' + v/u')}.$$ (9.27)

This expression describes, in particular, the so-called stellar aberration effect, the dependence of the observed direction $\theta$ toward a star on the speed $v$ of the telescope motion relative to the star - see Fig. 8. (The effect is readily observable experimentally as the annual aberration due to the periodic change of speed $v$ by $2v_E \approx 60$ km/s because of Earth's rotation about the Sun. Since the aberration's main part is of the first order in $v_E/c \sim 10^{-4}$, the effect is very significant and has been known since the early 1700s.)



Fig. 9.8. Stellar aberration. 


For the analysis of this effect, it is sufficient to take, in Eq. (27), $u' = c$, i.e. $v/u' = \beta$, and interpret $\theta'$ as the “proper” direction to the star, that would be measured at $v = 0$. 19 At $\beta \ll 1$, both Eq. (27) and the Galilean result (which the reader is invited to derive directly from Fig. 8),

$$\tan\theta = \frac{\sin\theta'}{\cos\theta' + \beta},$$ (9.28)

may be well approximated by the first-order term

$$\Delta\theta \equiv \theta' - \theta \approx \beta\sin\theta.$$ (9.29)


Unfortunately, it is not easy to use the difference between Eqs. (27) and (28), of the second order in $\beta$, for the special relativity confirmation, because other components of Earth's motion, such as its rotation, nutation and torque-induced precession, 20 give masking first-order contributions to the aberration.

Finally, at a completely arbitrary direction of vector $\mathbf{u}'$, Eqs. (22) may be readily used to calculate the velocity magnitude. The most popular form of the resulting expression is for the square of the relative velocity (or rather the relative reduced velocity $\boldsymbol{\beta}$) of two particles,

$$\beta^2 = \frac{(\boldsymbol{\beta}_1 - \boldsymbol{\beta}_2)^2 - |\boldsymbol{\beta}_1 \times \boldsymbol{\beta}_2|^2}{(1 - \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2)^2} < 1,$$ (9.30)

where $\boldsymbol{\beta}_{1,2} \equiv \mathbf{v}_{1,2}/c$ are their normalized velocities as measured in the same reference frame.
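
Eq. (30) may be cross-checked against the component rules (22). In the Python sketch below (my own illustration, with hypothetical function names), axis $x$ is directed along the velocity of particle 1, both velocities are assumed to lie in the [$x$, $y$] plane, and the velocity of particle 2 is recalculated into the rest frame of particle 1 using Eqs. (22) with $v$ replaced by $-v$; the squared magnitude of the result should coincide with Eq. (30).

```python
import math


def relative_beta_squared(b1, b2):
    """Eq. (9.30) for two normalized velocities given as (x, y) pairs."""
    diff2 = (b1[0] - b2[0])**2 + (b1[1] - b2[1])**2
    cross = b1[0] * b2[1] - b1[1] * b2[0]      # z-component of b1 x b2
    dot = b1[0] * b2[0] + b1[1] * b2[1]
    return (diff2 - cross**2) / (1.0 - dot)**2


def beta_in_moving_frame(b, beta_frame):
    """Eqs. (9.22) with v -> -v: velocity b as seen from a frame moving along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta_frame**2)
    denom = 1.0 - b[0] * beta_frame
    return ((b[0] - beta_frame) / denom, b[1] / (gamma * denom))


b1 = (0.6, 0.0)                         # particle 1 moves along axis x
b2 = (0.3, 0.5)                         # particle 2 moves at an angle to it
seen = beta_in_moving_frame(b2, b1[0])  # particle 2 in the rest frame of particle 1
print(seen[0]**2 + seen[1]**2, relative_beta_squared(b1, b2))
```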

(iv) The Doppler effect. Now let us consider a plane, monochromatic wave moving along axis $x$:

$$f = \mathrm{Re}\left[f_\omega \exp\{i(kx - \omega t)\}\right] = |f_\omega|\cos(kx - \omega t + \arg f_\omega).$$ (9.31)


19 Strictly speaking, in order to reconcile the geometries shown in Fig. 1 (for which all our formulas, including Eq. (27), are valid) and Fig. 8 (giving the traditional scheme of the aberration), it is necessary to invert the signs of $\mathbf{u}$ (and hence $\sin\theta'$ and $\cos\theta'$) and $v$, but as evident from Eq. (27), all the minus signs cancel, and the formula is valid as is.

20 See, e.g., CM Secs. 6.4-6.5. 
Its total phase, $\Psi \equiv kx - \omega t + \arg f_\omega$ (in contrast to the amplitude $|f_\omega|$!) cannot depend on the observer's reference frame, because all fields of a traveling wave vanish simultaneously at $\Psi = \pi(n + 1/2)$ (for all integer $n$), and such “world events” should take place in all reference frames. The only way to keep $\Psi = \Psi'$ at all times is to have 21

$$kx - \omega t = k'x' - \omega' t'.$$ (9.32)

First, let us consider the Doppler effect describing usual nonrelativistic waves, e.g., oscillations of particles of a certain medium. Using the Galilean transform (2), we may rewrite Eq. (32) as

$$k(x' + vt) - \omega t = k'x' - \omega' t.$$ (9.33)

Since this transform leaves all space intervals (including the wavelength $\lambda = 2\pi/k$) intact, we can take $k = k'$, so that Eq. (33) yields

$$\omega' = \omega - kv.$$ (9.34)

For a dispersion-free medium, the wave number $k$ is the ratio of its frequency $\omega$, as measured in the reference frame bound to the medium, and the wave velocity $v_w$. In particular, if the wave source rests in the medium, we can bind frame 0 to the medium as well, and frame 0' to the wave's receiver (so that $v = v_r$), so that

$$k = \frac{\omega}{v_w},$$ (9.35)

and for the frequency perceived by the receiver, Eq. (34) yields

$$\omega' = \omega\,\frac{v_w - v_r}{v_w}.$$ (9.36)

On the other hand, if the receiver and the medium are at rest in reference frame 0', while the wave source is bound to frame 0 (so that $v = -v_s$), Eq. (35) should be replaced with

$$k = k' = \frac{\omega'}{v_w},$$ (9.37)

and Eq. (34) yields a different result:

$$\omega' = \omega\,\frac{v_w}{v_w - v_s}.$$ (9.38)

Finally, if both the source and detector are moving, it is straightforward to combine these two results to get the general relation

$$\omega' = \omega\,\frac{v_w - v_r}{v_w - v_s}.$$ (9.39)

At low speeds of both the source and receiver, this result simplifies,


21 Strictly speaking, Eq. (32) is valid to an additive constant, but for notation simplicity, it may be always made equal to zero by selecting (as has already been done in all relations of Sec. 1) the reference frame origins and/or clock turn-on times so that at $t = 0$ and $x = 0$, $t' = 0$ and $x' = 0$ as well.
$$\omega' \approx \omega(1 - \beta), \qquad \beta \equiv \frac{v_r - v_s}{v_w},$$ (9.40)


but at speeds comparable to $v_w$ we have to use the more general Eq. (39). Thus, the usual Doppler effect is affected not only by the relative speed $(v_r - v_s)$ of the wave's source and detector, but also by their speeds relative to the medium in which the waves propagate.

Somewhat counter-intuitively, for electromagnetic waves the calculations are simpler, because for them the propagation medium (ether) does not exist, the wave velocity equals $\pm c$ in any reference frame, and there are no two separate cases: we can always take $k = \pm\omega/c$, $k' = \pm\omega'/c$. Plugging these relations, together with the Lorentz transform (19a), into the phase-invariance equation (32), we get

$$\pm\frac{\omega}{c}\,\gamma(x' + \beta ct') - \omega\gamma\,\frac{ct' + \beta x'}{c} = \pm\frac{\omega'}{c}\,x' - \omega' t'.$$ (9.41)

This relation has to hold for any $x'$ and $t'$, so we may require the net coefficients before these variables to vanish. These two requirements yield the same condition:

$$\omega' = \omega\gamma(1 \mp \beta).$$ (9.42)


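This result is already quite simple, but may be transformed further to be even more illuminating:

$$\omega' = \omega\,\frac{1 \mp \beta}{(1 - \beta^2)^{1/2}} = \omega\left[\frac{(1 \mp \beta)(1 \mp \beta)}{(1 + \beta)(1 - \beta)}\right]^{1/2}.$$ (9.43)

At any sign before $\beta$, one pair of parentheses cancel, so that

$$\omega' = \omega\left(\frac{1 \mp \beta}{1 \pm \beta}\right)^{1/2}.$$ (9.44)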


(It may look like the reciprocal expression of $\omega$ via $\omega'$ is different, violating the relativity principle. However, in this case we have to change the sign of $\beta$, because the relative velocity of the system is opposite, so we come down to Eq. (44) again.)


Thus the Doppler effect for electromagnetic waves depends only on the relative velocity $v = \beta c$ between the wave source and detector - as it should be, given the absence of the ether. At velocities much below $c$, Eq. (44) may be crudely approximated as

$$\omega' \approx \omega\,\frac{1 \mp \beta/2}{1 \pm \beta/2} \approx \omega(1 \mp \beta),$$ (9.45)

i.e. in the first approximation in $\beta = v/c$ it coincides with the corresponding limit (40) of the usual Doppler effect. However, even at $v \ll c$ there is still a difference of the order of $(v/c)^2$ between the Galilean and Lorentzian relations.
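
The size of this second-order difference may be seen from a simple numerical comparison. The Python sketch below is an illustration under my own conventions: it takes the upper signs in Eqs. (42)-(44) (the source and the detector receding from each other with relative speed $\beta c$), and compares the result with the two Galilean formulas (36) and (38) for the same relative speed, expressed in units of the wave velocity; the function names are hypothetical.

```python
import math


def doppler_em(beta):
    """Eq. (9.44), upper signs: omega'/omega for electromagnetic waves."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


def doppler_moving_receiver(beta):
    """Eq. (9.36) with v_r / v_w = beta: receiver receding, source at rest in the medium."""
    return 1.0 - beta


def doppler_moving_source(beta):
    """Eq. (9.38) with v_s / v_w = -beta: source receding, receiver at rest in the medium."""
    return 1.0 / (1.0 + beta)


for beta in (0.001, 0.01, 0.1):
    em = doppler_em(beta)
    # The differences from both Galilean results scale as beta**2:
    print(beta,
          (em - doppler_moving_receiver(beta)) / beta**2,
          (doppler_moving_source(beta) - em) / beta**2)
```

Both printed ratios should stay close to 1/2 at small $\beta$; in these units the relativistic factor is exactly the geometric mean of the two Galilean ones.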


If the wave vector $\mathbf{k}$ is tilted by angle $\theta$ to vector $\mathbf{v}$ (as measured in frame 0), then we have to repeat the calculations, with $k$ replaced by $k_x$, and components $k_y$ and $k_z$ left intact at the Lorentz transform. As a result, Eq. (42) is generalized as

$$\omega' = \omega\gamma(1 - \beta\cos\theta).$$ (9.46)
For the cases $\cos\theta = \pm 1$, Eq. (46) reduces to our previous result. However, at $\theta = \pi/2$ (i.e. $\cos\theta = 0$), the relation is rather different:

$$\omega' = \gamma\omega = \frac{\omega}{(1 - \beta^2)^{1/2}}.$$ (9.47)

This is the transverse Doppler effect - which is completely absent in the nonrelativistic physics. Its first experimental evidence was obtained using fast ion beams (“canal rays”, as suggested in 1906 by J. Stark), by H. Ives and G. Stilwell in 1938 and 1941. Later, similar experiments were repeated several times, but the first unambiguous measurements were performed only in 1979 by D. Hasselkamp et al. who confirmed Eq. (47) with a relative accuracy of about 10%. This precision may not look too spectacular, but besides the special tests discussed above, the Lorentz transform formulas have been also confirmed, less directly, by a huge body of other experimental data, especially in high energy physics, being in agreement with calculations incorporating the transform as their part. This is why, with every respect to the challenging-authority spirit, I should warn the reader: if you decide to challenge the relativity theory (that is called “theory” by tradition only), you would also need to explain all these data. 22 Best of luck with that!


9.3. 4-vectors, momentum, mass, and energy 

Before proceeding to relativistic dynamics, let us discuss a mathematical language that makes all the calculations more compact - and more beautiful. We have already seen that spatial coordinates $\{x, y, z\}$ and product $ct$ are Lorentz-transformed similarly - see Eqs. (19). So it is natural to consider them as components of a 4-component vector (or, for short, 4-vector),

$$\{x_0, x_1, x_2, x_3\} \equiv \{ct, \mathbf{r}\},$$ (9.48)

with components

$$x_0 \equiv ct, \quad x_1 \equiv x, \quad x_2 \equiv y, \quad x_3 \equiv z.$$ (9.49)

According to Eqs. (19), its components are Lorentz-transformed as

$$x_j = \sum_{j'=0}^{3} L_{jj'}\,x'_{j'},$$ (9.50)

where $L_{jj'}$ are the elements of the 4×4 Lorentz transform matrix

$$\begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$ (9.51)


Since 4-vectors are a new notion for our course, and are used for many more purposes than just the space-time transform, we need to discuss the mathematical rules they obey. Indeed, as was


22 The same fact, ignored by crackpots, is also valid for other favorite points of their attacks, including the 
Universe expansion and quantum mechanics in physics, and the evolution theory in biology. 



Chapter 9 


Page 14 of 54 


Essential Graduate Physics 


EM: Classical Electrodynamics 



mentioned in Sec. 8.9, the usual (3D) vector is not just any ordered set (string) of three scalars $\{A_x, A_y, A_z\}$; if we want it to represent a reference-frame-independent physical variable, the vector's components have to obey certain rules at transfer from one reference frame to another. In particular, the vector's 3D norm (magnitude squared),

$$A^2 = A_x^2 + A_y^2 + A_z^2,$$ (9.52)

should be an invariant at the Galilean transform (2). However, a naive extension of this formula to 4-vectors would not work, because, according to the calculations of Sec. 1, the Lorentz transform keeps intact combinations of the type (7), with one sign negative, rather than the sum of all components squared. Hence for the 4-vector all the rules of the game have to be reviewed and adjusted - or rather redefined from the very beginning.


An arbitrary 4-vector is a string of 4 scalars,

$$\{A_0, A_1, A_2, A_3\},$$ (9.53)

defined in 4D Minkowski space, 23 whose components $A_j$, as measured in systems 0 and 0', shown in Fig. 1, obey the Lorentz transform similar to Eq. (50):

$$A_j = \sum_{j'=0}^{3} L_{jj'}\,A'_{j'}.$$ (9.54)

As we have already seen on the example of the space-time 4-vector (48), this means in particular that

$$A_0^2 - \sum_{j=1}^{3} A_j^2 = A_0'^2 - \sum_{j=1}^{3} A_j'^2.$$ (9.55)

This is the so-called Lorentz invariance condition of the norm of the 4-vector. (The difference between this relation and Eq. (52), pertaining to the Euclidean geometry, is the reason why the Minkowski space is called pseudo-Euclidean.) It is also straightforward to use Eqs. (51) and (54) to check that an evident generalization of the norm, the scalar product of two arbitrary 4-vectors,

$$A_0 B_0 - \sum_{j=1}^{3} A_j B_j,$$ (9.56)

is also Lorentz-invariant.
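
The check suggested above may also be carried out numerically. The following Python sketch (pure standard library; the function names and the sample 4-vectors are arbitrary illustrations, not taken from the notes) builds the matrix (51), applies the transform (54) to two strings of 4 numbers, and compares their scalar product (56) before and after the transform.

```python
import math


def lorentz_matrix(beta):
    """The 4x4 matrix of Eq. (9.51) for the relative velocity beta*c along axis x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, beta * gamma, 0.0, 0.0],
        [beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(matrix, a_prime):
    """Eq. (9.54): A_j = sum over j' of L_jj' * A'_j'."""
    return [sum(matrix[j][k] * a_prime[k] for k in range(4)) for j in range(4)]


def scalar_product(a, b):
    """Eq. (9.56): A_0*B_0 minus the usual 3D dot product."""
    return a[0] * b[0] - sum(a[j] * b[j] for j in range(1, 4))


L = lorentz_matrix(0.6)
a_p, b_p = [5.0, 3.0, 1.0, 2.0], [2.0, -1.0, 4.0, 0.5]   # arbitrary sample 4-vectors
a, b = transform(L, a_p), transform(L, b_p)
print(scalar_product(a_p, b_p), scalar_product(a, b))   # should agree
print(scalar_product(a_p, a_p), scalar_product(a, a))   # the norm, Eq. (9.55)
```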


Now consider the 4-vector corresponding to an infinitesimal interval between two close world events:

$$\{dx_0, dx_1, dx_2, dx_3\} = \{c\,dt, d\mathbf{r}\};$$ (9.57)

its norm,

$$(ds)^2 \equiv dx_0^2 - \sum_{j=1}^{3} dx_j^2 = c^2(dt)^2 - (dr)^2,$$ (9.58)


23 After H. Minkowski who was first to recast (in 1907) the special relativity relations in a form in which space 
coordinates and time (or rather ct) are treated on an equal footing. 
is of course also Lorentz-invariant. Since the speed of any particle (or signal) cannot be larger than $c$, for any pair of world events that are in a causal relation with each other, $dr$ cannot be larger than $c\,dt$, i.e. such a time-like interval $(ds)^2$ cannot be negative. The 4D surface separating such intervals from space-like intervals $(ds)^2 < 0$ is called the light cone (Fig. 9).



Fig. 9.9. 2+1 dimensional image of 
the light cone (which is actually 3+1 
dimensional). 


Now let us assume that these two close world events happen with the same particle that moves with velocity $\mathbf{u}$. Then in the frame moving with the particle ($\mathbf{v} = \mathbf{u}$), the last term in the right-hand part of Eq. (58) equals zero, so that

$$ds = c\,d\tau,$$ (9.59)

where $d\tau$ is the proper time interval. But according to Eq. (21), this means that we can write

$$d\tau = \frac{dt}{\gamma},$$ (9.60)

where $dt$ is the time interval in an arbitrary (besides being inertial) reference frame, while

$$\boldsymbol{\beta} \equiv \frac{\mathbf{u}}{c} \quad \text{and} \quad \gamma \equiv \frac{1}{(1 - \beta^2)^{1/2}} = \frac{1}{(1 - u^2/c^2)^{1/2}}$$ (9.61)

are the parameters (17) corresponding to the particle's velocity ($\mathbf{u}$) in that frame, so that $ds = c\,dt/\gamma$. 24

Now, let us explore whether a 4-vector can be formed using the spatial components of the particle's velocity

$$\mathbf{u} = \left\{\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right\}.$$ (9.62)

Here we have a slight problem: as Eqs. (22) show, these components do not obey the Lorentz transform. However, let us use $d\tau \equiv dt/\gamma$, the proper time interval of the particle, to form the following string:


24 I have opted against using special indices (e.g., $\beta_u$, $\gamma_u$) to distinguish Eqs. (17) and (61) here and below, in the hope that the suitable velocity (of a reference frame or of a particle) will be always clear from the context.




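$$\left\{\frac{dx_0}{d\tau}, \frac{dx_1}{d\tau}, \frac{dx_2}{d\tau}, \frac{dx_3}{d\tau}\right\} \equiv \gamma\left\{c, \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right\} \equiv \gamma\{c, \mathbf{u}\}.$$ (9.63)

As follows from the comparison of the first form of this expression with Eq. (48), since the time-space vector obeys the Lorentz transform, and $\tau$ is Lorentz-invariant, string (63) is a legitimate 4-vector. It is called the 4-velocity of the particle.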


Now we are properly equipped to proceed to dynamics. Let us start with such basic notions as momentum $\mathbf{p}$ and energy $\mathcal{E}$ - so far, for a free particle. 25 Perhaps the most elegant way to “derive” (or rather guess 26 ) expressions for $\mathbf{p}$ and $\mathcal{E}$ as functions of the particle's velocity $\mathbf{u}$ is based on analytical mechanics. Due to the conservation of $\mathbf{u}$, the trajectory of a free particle in the 4D Minkowski space is always a straight line. Hence, from the Hamilton principle of minimum action, 27 we may expect its action $S$, between points 1 and 2, to be a linear function of the space-time interval (59):

$$S = \alpha\int_1^2 ds = \alpha c\int_1^2 d\tau = \alpha c\int_{t_1}^{t_2}\frac{dt}{\gamma},$$ (9.64)


where $\alpha$ is some constant. On the other hand, in analytical mechanics the action is defined as

$$S \equiv \int_{t_1}^{t_2}\mathcal{L}\,dt,$$ (9.65)

where $\mathcal{L}$ is the particle's Lagrangian function. 28 Comparing these two expressions, we get

$$\mathcal{L} = \frac{\alpha c}{\gamma} = \alpha c\left(1 - \frac{u^2}{c^2}\right)^{1/2}.$$ (9.66)

In the nonrelativistic limit ($u \ll c$), this function tends to

$$\mathcal{L} \approx \alpha c\left(1 - \frac{u^2}{2c^2}\right) = \alpha c - \frac{\alpha u^2}{2c}.$$ (9.67)
In order to correspond to the Newtonian mechanics, the last (velocity-dependent) term should equal $mu^2/2$. From here we find $\alpha = -mc$, so that, finally,

$$\mathcal{L} = -mc^2\left(1 - \frac{u^2}{c^2}\right)^{1/2} \equiv -\frac{mc^2}{\gamma}.$$ (9.68)


25 I am sorry for using, as in Sec. 6.3, for particle’s momentum, the same traditional notation (p) as had been used 
for the dipole electric moment. However, since the latter notion will be virtually unused in the balance of the 
notes, this may hardly lead to confusion. 

26 Indeed, such a derivation uses additional assumptions, however natural (such as the Lorentz-invariance of $S$), so it can hardly be considered as a real proof of the final results, so that they require experimental confirmation. Fortunately, such confirmations have been numerous - see below.

27 See, e.g., CM Sec. 10.3. 

28 See, e.g., CM Sec. 2.1. 
Now we can find the Cartesian components $p_j$ of the particle's momentum, as the generalized momenta
corresponding to the components $r_j$ (j = 1, 2, 3) of the 3D radius-vector r: 29


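$$p_j \equiv \frac{\partial\mathcal{L}}{\partial\dot{r}_j} = \frac{\partial\mathcal{L}}{\partial u_j} = -mc^2\frac{\partial}{\partial u_j}\left(1-\frac{u_1^2+u_2^2+u_3^2}{c^2}\right)^{1/2} = \frac{mu_j}{(1-u^2/c^2)^{1/2}} \equiv m\gamma u_j. \tag{9.69}$$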
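The differentiation leading to Eq. (9.69) can be cross-checked numerically. The sketch below is only an illustration, under assumptions that are not part of the notes: units with c = 1, a unit rest mass, an arbitrary test velocity, and hypothetical helper names (`lagrangian`, `momentum_numeric`, `momentum_exact`). It compares a central finite difference of the Lagrangian (9.68) over each velocity component with the closed form $m\gamma u_j$.

```python
import math

def lagrangian(u, m=1.0, c=1.0):
    """Free-particle Lagrangian of Eq. (9.68) for a 3-velocity u = (u1, u2, u3)."""
    u2 = sum(x * x for x in u)
    return -m * c**2 * math.sqrt(1.0 - u2 / c**2)

def momentum_numeric(u, j, m=1.0, c=1.0, h=1e-6):
    """Central finite difference of the Lagrangian over the j-th velocity component."""
    up = list(u)
    dn = list(u)
    up[j] += h
    dn[j] -= h
    return (lagrangian(up, m, c) - lagrangian(dn, m, c)) / (2.0 * h)

def momentum_exact(u, j, m=1.0, c=1.0):
    """Closed form of Eq. (9.69): p_j = m * gamma * u_j."""
    u2 = sum(x * x for x in u)
    return m * u[j] / math.sqrt(1.0 - u2 / c**2)

u = (0.6, 0.3, 0.0)   # velocity in units of c (an arbitrary test value)
for j in range(3):
    print(j, momentum_numeric(u, j), momentum_exact(u, j))
```

The two printed columns should agree to within the finite-difference error, which for this step size is far below the printed precision.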


Thus for the 3D vector of momentum, we can write the result in the same form as in nonrelativistic 
mechanics, 

$$\mathbf{p} = m\gamma\mathbf{u} \equiv M\mathbf{u}, \tag{9.70}$$

if we introduce the reference-frame-dependent scalar M (called the relativistic mass) defined as

$$M \equiv m\gamma = \frac{m}{(1-u^2/c^2)^{1/2}} \geq m, \tag{9.71}$$


m being the nonrelativistic mass of the particle. (It is also called the rest mass, because in the reference
frame in which the particle rests, Eq. (71) yields M = m.)

Next, let us return to analytical mechanics to calculate the particle's energy $\mathcal{E}$ (which for a free
particle coincides with the Hamiltonian function $\mathcal{H}$):


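$$\mathcal{E} = \mathcal{H} = \sum_{j=1}^{3}p_ju_j - \mathcal{L} = \mathbf{p}\cdot\mathbf{u} - \mathcal{L} = \frac{mu^2}{(1-u^2/c^2)^{1/2}} + mc^2\left(1-\frac{u^2}{c^2}\right)^{1/2} = \frac{mc^2}{(1-u^2/c^2)^{1/2}}. \tag{9.72}$$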


Thus, we have arrived at the most famous of Einstein's formulas (and probably the most famous formula
of physics as a whole),

$$\mathcal{E} = m\gamma c^2 \equiv Mc^2, \tag{9.73}$$

that expresses the relation between the particle's mass and energy. 30 In the nonrelativistic limit, it reduces to


$$\mathcal{E} = \frac{mc^2}{(1-u^2/c^2)^{1/2}} \approx mc^2\left(1+\frac{u^2}{2c^2}\right) = mc^2 + \frac{mu^2}{2}, \tag{9.74}$$

the first term $mc^2$ being called the rest energy of a particle.
Now let us consider the following string of 4 scalars: 




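$$\left\{\frac{\mathcal{E}}{c}, p_1, p_2, p_3\right\} \equiv \left\{\frac{\mathcal{E}}{c}, \mathbf{p}\right\}. \tag{9.75}$$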


Using Eqs. (70) and (73) to present this expression as 


29 See, e.g., CM Sec. 2.3. 

30 Let me hope that the reader understands that all the layman talk about the “mass to energy conversion” is only
valid in a very limited sense of the word. While the Einstein relation (73) does allow the conversion of “massive”
particles (with $m \neq 0$) into massless particles such as photons, each of the latter particles also has a nonvanishing
relativistic mass M, and simultaneously the energy related to M by Eq. (73).


$$\left\{\frac{\mathcal{E}}{c}, \mathbf{p}\right\} = \{m\gamma c, m\gamma\mathbf{u}\} \equiv m\gamma\{c, \mathbf{u}\}, \tag{9.76}$$


and comparing the result with Eq. (63), we immediately see that, since m is Lorentz-invariant, this string 
is a legitimate 4-vector of energy-momentum. As a result, its norm, 


$$\left(\frac{\mathcal{E}}{c}\right)^2 - p^2, \tag{9.77}$$


is Lorentz-invariant, and in particular has to be equal to the norm in the particle's rest frame. But in that
frame, p = 0 and $\mathcal{E} = mc^2$, so that in an arbitrary frame

$$\left(\frac{\mathcal{E}}{c}\right)^2 - p^2 = (mc)^2. \tag{9.78a}$$


This very important relation 31 between the relativistic energy and momentum (valid for free particles 
only!) is usually presented in the form 32 


$$\mathcal{E}^2 = (mc^2)^2 + (pc)^2. \tag{9.78b}$$
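Relations (9.70), (9.73) and (9.78) are easy to probe numerically. The following is a minimal sketch, assuming units with c = 1 by default and an arbitrary test mass; the function names are illustrative only. It evaluates the norm $(\mathcal{E}/c)^2 - p^2$ at several speeds, and also compares the kinetic energy $\mathcal{E} - mc^2$ at a small speed with the Newtonian $mu^2/2$.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def momentum(m, u, c=1.0):
    """Relativistic momentum magnitude, Eq. (9.70)."""
    return m * gamma(u, c) * u

def energy(m, u, c=1.0):
    """Free-particle energy, Eq. (9.73)."""
    return m * gamma(u, c) * c**2

def invariant(m, u, c=1.0):
    """(E/c)^2 - p^2, which Eq. (9.78a) says equals (m c)^2 at any speed."""
    return (energy(m, u, c) / c) ** 2 - momentum(m, u, c) ** 2

m = 2.0   # arbitrary test mass
for u in (0.0, 0.1, 0.5, 0.9, 0.999):
    print(u, invariant(m, u))

# Nonrelativistic check of Eq. (9.74): kinetic energy versus m u^2 / 2.
u_slow = 1e-3
print(energy(m, u_slow) - m, 0.5 * m * u_slow**2)
```

Every line of the first loop should print a value close to $m^2 = 4$, up to rounding errors.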


According to Eq. (70), in the ultrarelativistic limit $u \to c$, p tends to infinity while $mc^2$ stays
constant, so that $pc \gg mc^2$. As follows from Eq. (78), in this limit $\mathcal{E} \approx pc$. Though the above discussion
was for particles with finite m, the 4-vector formalism allows us to consider particles with zero rest mass
as ultrarelativistic particles for which the above energy-to-momentum relation,

$$\mathcal{E} = pc, \quad \text{for } m = 0, \tag{9.79}$$


is exact. Quantum electrodynamics 33 tells us that under certain conditions, electromagnetic field quanta
(photons) may be also considered as such massless particles, with momentum $\mathbf{p} = \hbar\mathbf{k}$. Plugging (the
modulus of) the last relation into Eq. (78), for the photon's energy we get $\mathcal{E} = pc = \hbar kc = \hbar\omega$. Please note
that according to Eq. (73), the relativistic mass of a photon is finite: $M = \mathcal{E}/c^2 = \hbar\omega/c^2$, so that the term
“massless particle” has a limited meaning: m = 0. For example, the relativistic mass of an optical photon is of the
order of $10^{-36}$ kg. This is not too much, but still a noticeable (approximately one-millionth) part of the
rest mass $m_e$ of an electron.
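The order of magnitude of the photon's relativistic mass can be checked with a few lines of arithmetic. This is a rough sketch: the 500 nm wavelength is an assumed, typical "optical" value not taken from the notes, and the electron mass is a rounded reference value.

```python
H = 6.62607015e-34        # Planck constant, J*s (exact SI value)
C = 299_792_458.0         # speed of light, m/s (exact SI value)
M_E = 9.1093837e-31       # electron rest mass, kg (rounded)

def photon_relativistic_mass(wavelength_m):
    """M = E/c^2 with E = hbar*omega = h*c/lambda, so that M = h/(lambda*c)."""
    return H / (wavelength_m * C)

M = photon_relativistic_mass(500e-9)   # 500 nm: an assumed optical wavelength
print(M, M / M_E)
```

For this wavelength, the result is a few times $10^{-36}$ kg, i.e. a few millionths of the electron's rest mass.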

The fundamental relations (70) and (73) have been repeatedly verified in numerous particle
collision experiments, in which the total energy and momentum of a system of particles are conserved -
under the same conditions as in nonrelativistic dynamics. (For momentum, this is the absence of external
forces, and for energy, the elasticity of particle interactions - in other words, the absence of alternative
channels of energy escape.) Of course, generally, the total energy of the system is conserved, including
the potential energy of particle interactions. However, at typical particle collisions, the potential energy


31 Please note one more useful relation following from Eqs. (70) and (73): $\mathbf{p} = (\mathcal{E}/c^2)\mathbf{u}$.

32 It may be tempting to interpret this relation as the perpendicular-vector-like addition of the rest energy $mc^2$ and
the “kinetic energy” pc, but from the point of view of the total energy conservation (see below), a better definition
of the kinetic energy is $T(u) \equiv \mathcal{E}(u) - \mathcal{E}(0)$.

33 Briefly reviewed in QM Chapter 9. 




vanishes so rapidly with the distance between them that we can use the momentum and energy 
conservation using Eq. (73). 

As an example, let us calculate the minimum energy $\mathcal{E}_{min}$ of a proton ($p_a$), necessary for the
well-known high-energy reaction that generates a new proton-antiproton pair, $p_a + p_b \to p + p + p + \bar{p}$,
provided that before the collision, proton $p_b$ has been at rest in the lab frame. This minimum evidently
corresponds to the vanishing relative velocity of the reaction products, i.e. their motion with virtually
the same velocity ($u_{fin}$), as seen from the lab frame - see Fig. 10.



[Figure panels: lab frame; c.o.m. frame.]

Fig. 9.10. High-energy proton reaction at $\mathcal{E} = \mathcal{E}_{min}$ - schematically.


Due to the momentum conservation, this velocity should have the same direction as the initial
velocity ($u_{min}$) of proton $p_a$. This is why two scalar equations, for the energy conservation,

$$\frac{mc^2}{(1-u_{min}^2/c^2)^{1/2}} + mc^2 = \frac{4mc^2}{(1-u_{fin}^2/c^2)^{1/2}}, \tag{9.80a}$$

and momentum conservation,

$$\frac{mu_{min}}{(1-u_{min}^2/c^2)^{1/2}} + 0 = \frac{4mu_{fin}}{(1-u_{fin}^2/c^2)^{1/2}}, \tag{9.80b}$$

are sufficient to find both $u_{min}$ and $u_{fin}$. After a conceptually simple but rather tedious solution of this
system of two nonlinear equations, we get

$$u_{min} = \frac{4\sqrt{3}}{7}c, \qquad u_{fin} = \frac{\sqrt{3}}{2}c. \tag{9.81}$$


Finally, we can use Eq. (73) to calculate the required energy: $\mathcal{E}_{min} = 7mc^2$. (Note that of the acceleration
energy $6mc^2$, only $2mc^2$ go into the “useful” proton-antiproton pair production.) Proton's rest mass, $m_p \approx$
$1.67\times10^{-27}$ kg, corresponds to the rest energy $m_pc^2 \approx 1.502\times10^{-10}$ J $\approx$ 0.938 GeV, so that $\mathcal{E}_{min} \approx 6.57$ GeV.
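Rather than repeating the tedious algebra, one may simply substitute the quoted solution back into the two conservation laws. The sketch below does this in dimensionless units (energies in units of $mc^2$, momenta in units of $mc$, speeds in units of c); the variable names are illustrative, and the rest energy value is the one quoted in the text.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta*c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

# Speeds in units of c, as quoted in Eq. (9.81).
beta_min = 4.0 * math.sqrt(3.0) / 7.0   # incident proton, lab frame
beta_fin = math.sqrt(3.0) / 2.0         # common speed of the four products

# Energy balance (9.80a), in units of m c^2.
energy_before = gamma(beta_min) + 1.0
energy_after = 4.0 * gamma(beta_fin)

# Momentum balance (9.80b), in units of m c.
momentum_before = gamma(beta_min) * beta_min
momentum_after = 4.0 * gamma(beta_fin) * beta_fin

REST_ENERGY_GEV = 0.938   # proton rest energy, as quoted in the text
e_min_gev = gamma(beta_min) * REST_ENERGY_GEV

print(energy_before, energy_after)
print(momentum_before, momentum_after)
print(e_min_gev)
```

Both balances should close (8 on each side for the energy, $4\sqrt{3}$ for the momentum), and the last line reproduces the threshold energy of about 6.57 GeV.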


The second, more intelligent way to solve the same problem is to use the center-of-mass {c.o.m.) 
reference frame that, in relativity, is defined as the frame in which the total momentum of the system 
vanishes. 34 In this frame, at $\mathcal{E} = \mathcal{E}_{min}$, the velocities and momenta of all reaction products are equal to
zero, while velocities of protons $p_a$ and $p_b$ before the collision are equal and opposite, with some
magnitude u'. Hence the energy conservation law becomes

$$\frac{2mc^2}{(1-u'^2/c^2)^{1/2}} = 4mc^2, \tag{9.82}$$


34 Note that according to this definition, the c.o.m.'s radius-vector is $\mathbf{R} = \sum_kM_k\mathbf{r}_k/\sum_kM_k = \sum_k\gamma_km_k\mathbf{r}_k/\sum_k\gamma_km_k$, generally
different from the well-known expression $\sum_km_k\mathbf{r}_k/\sum_km_k$ of the nonrelativistic mechanics.
readily giving $u' = (\sqrt{3}/2)c$. This is of course the same result as Eq. (81) gives for $u_{fin}$. Now we can use
the fact that the velocity of proton $p_b$ in the c.o.m. frame is (-u'), and hence the speed of proton $p_a$ is
(+u'). Hence we may find the lab-frame speed of proton $p_a$ using the velocity transform formula (25):

$$u_{min} = \frac{2u'}{1+u'^2/c^2}. \tag{9.83}$$

This relation gives the same result as the first method, $u_{min} = (4\sqrt{3}/7)c$, but in a much simpler way.
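The c.o.m.-frame route takes only a few lines of arithmetic. Here is a small sketch, with speeds in units of c and a hypothetical helper `add_velocities` for the collinear velocity addition rule (the form of formula (25) used above).

```python
import math

def add_velocities(u1, u2):
    """Relativistic addition of two parallel velocities, in units of c."""
    return (u1 + u2) / (1.0 + u1 * u2)

gamma_com = 4.0 / 2.0                              # from 2*gamma' = 4, Eq. (9.82)
beta_com = math.sqrt(1.0 - 1.0 / gamma_com**2)     # u'/c
beta_min = add_velocities(beta_com, beta_com)      # Eq. (9.83)
gamma_min = 1.0 / math.sqrt(1.0 - beta_min**2)     # threshold energy in units of m c^2

print(beta_com, beta_min, gamma_min)
```

The printed values should be close to $\sqrt{3}/2 \approx 0.866$, $4\sqrt{3}/7 \approx 0.990$, and 7.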


9.4. More on 4-vectors and 4-tensors 




This is a good moment to describe a formalism that will allow us, in particular, to solve the same 
proton collision problem in one more (and arguably, the most elegant) way. More importantly, this 
formalism will be virtually necessary for the description of the Lorentz transform of the electromagnetic 
field, and its interaction with relativistic particles - otherwise the formulas would be too cumbersome. 
Let us call the 4-vectors we have used before,

$$A^\alpha \equiv \{A_0, \mathbf{A}\}, \tag{9.84}$$

contravariant, and denote them with the top index, and introduce also covariant vectors,

$$A_\alpha \equiv \{A_0, -\mathbf{A}\}, \tag{9.85}$$

marked by the lower index. Now if we form a scalar product of these vectors using the standard (3D-like)
rule, just as a sum of the products of the corresponding components, we immediately get

$$A_\alpha A^\alpha = A^\alpha A_\alpha = A_0^2 - A^2. \tag{9.86}$$


Here and below the sign of sum of four components of the product has been dropped. 35 

The scalar product (86) is just the norm of the 4-vector in our former definition, and as we 
already know, is Lorentz-invariant. Moreover, the scalar product of two different vectors (also a Lorentz 
invariant), may be written in any of two similar forms: 36 

$$A_0B_0 - \mathbf{A}\cdot\mathbf{B} = A_\alpha B^\alpha = A^\alpha B_\alpha; \tag{9.87}$$

again, the only caveat is to take one vector in the covariant, and another in the contravariant form.

Now let us return to our sample problem (Fig. 10). Since all components ($\mathcal{E}/c$ and p) of the total
4-momentum of our system are conserved at the collision, its norm is conserved as well:

$$(p_a + p_b)_\alpha(p_a + p_b)^\alpha = (4p)_\alpha(4p)^\alpha. \tag{9.88}$$

Since now the scalar product is the usual math construct, we know that the parentheses in the left-hand
part of this equation may be multiplied as usual. We may also swap the operands and move constant
factors around as convenient. As a result, we get


35 This compact notation may take some time to be accustomed to, but can hardly lead to any confusion, due to 
the following rule: the summation is implied always (and only) when an index is repeated twice, once on the top 
and another at the bottom. In these notes, this shorthand notation will be used only for 4-vectors, but not for the 
usual (spatial) vectors. 

36 Note also that, by definition, for any two 4-vectors, $A_\alpha B^\alpha = B^\alpha A_\alpha$.
$$(p_a)_\alpha(p_a)^\alpha + (p_b)_\alpha(p_b)^\alpha + 2(p_a)_\alpha(p_b)^\alpha = 16p_\alpha p^\alpha. \tag{9.89}$$


Thanks to the Lorentz-invariance of each of the terms, we can calculate each of them in the
reference frame we like. For the first two terms in the left-hand part, as well as for the right-hand part term,
it is beneficial to use the frames in which that particular proton is at rest; as a result, each of these two
left-hand part terms equals $(mc)^2$, while the right-hand part equals $16(mc)^2$. On the contrary, the last term of
the left-hand part is better evaluated in the lab frame, because in that frame the three spatial components
of the 4-momentum $p_b$ vanish, and the scalar product is just the product of scalars $\mathcal{E}/c$ for protons a
and b. For the latter proton this is just mc, so that we get a simple equation,

$$(mc)^2 + (mc)^2 + 2\frac{\mathcal{E}_{min}}{c}mc = 16(mc)^2, \tag{9.90}$$

immediately giving the final result we have already had: $\mathcal{E}_{min} = 7mc^2$.
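The same bookkeeping may be sketched in code. The example below assumes units with c = 1 and m = 1; `threshold_energy` is a hypothetical helper that simply solves the norm balance for the beam energy, written for arbitrary masses (a generalization that is not discussed in the text). The norm of the total 4-momentum is then verified directly in the lab frame.

```python
import math

def minkowski_dot(a, b):
    """Scalar product A_alpha B^alpha = A0*B0 - A.B of two 4-vectors."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def four_momentum(m, beta_x):
    """{E/c, p} for motion along x, in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - beta_x**2)
    return [m * g, m * g * beta_x, 0.0, 0.0]

def threshold_energy(m_beam, m_target, m_final_total):
    """Beam energy from m_a^2 + m_b^2 + 2*E*m_b = M^2, target at rest (c = 1)."""
    return (m_final_total**2 - m_beam**2 - m_target**2) / (2.0 * m_target)

m = 1.0
e_min = threshold_energy(m, m, 4.0 * m)                      # in units of m c^2
p_a = four_momentum(m, math.sqrt(1.0 - (m / e_min) ** 2))    # beam proton
p_b = four_momentum(m, 0.0)                                  # target proton at rest
total = [x + y for x, y in zip(p_a, p_b)]

print(e_min, minkowski_dot(total, total))
```

The first printed number is the threshold energy in units of $mc^2$, and the second is the norm of the total 4-momentum in units of $(mc)^2$, which should equal $4^2 = 16$.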


Let me hope that this example was a convincing demonstration of the convenience of presenting
4-vectors in the contravariant (84) and covariant (85) forms, 37 with Lorentz-invariant norms (86). To be
useful for more complex tasks, the formalism should be developed a little bit further. In particular, it is
crucial to know how the 4-vectors change under the Lorentz transform. For contravariant vectors, we
already know the answer (54), but let us rewrite it in the new notation:


$$A^\alpha = L^\alpha_{\ \beta}A'^\beta, \tag{9.91}$$

where $L^\alpha_{\ \beta}$ is the mixed Lorentz tensor (51): 38

$$L^\alpha_{\ \beta} = \begin{pmatrix}\gamma & \beta\gamma & 0 & 0\\ \beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}. \tag{9.92}$$


Note that though the position of indices α and β in the Lorentz tensor notation is not crucial, because it
is symmetric, it is convenient to place them using the general index balance rule: the difference of the
numbers of the upper and lower indices should be the same in both parts of any 4-vector/tensor equality,
with the top index in the denominator of a fraction counted as a bottom index in the numerator, and vice
versa. (Check yourself that all our formulas above do satisfy this rule.)

In order to rewrite Eq. (91) in a more general form that would not depend on the particular 
orientation of the coordinate axes (Fig. 1), let us use the contravariant and covariant forms of the 4- 
vector of the time-space interval (57), 


37 These forms are 4-vector extensions of the notions of contravariance and covariance introduced in the 1850s by
J. Sylvester for the description of 3D vector change at transfer between different reference frames - e.g., axes
rotation - cf. Fig. 3. In this case, the contravariance or covariance of a vector is uniquely determined by its nature:
if Cartesian coordinates of a vector (such as the nonrelativistic velocity v = dr/dt) are transformed similarly to the
radius-vector r, it is called contravariant, while the vectors (such as $\partial f/\partial\mathbf{r} \equiv \nabla f$) that require the reciprocal
transform, are called covariant. In the Minkowski space, both forms may be used for any 4-vector.

38 Just as 4-vectors, 4-tensors with two top indices are called contravariant, and those with two bottom indices, 
covariant. Tensors with one top and one bottom index are called mixed. 
$$dx^\alpha = \{c\,dt, d\mathbf{r}\}, \qquad dx_\alpha = \{c\,dt, -d\mathbf{r}\}; \tag{9.93}$$

then its norm (58) may be presented as 39

$$(ds)^2 = (c\,dt)^2 - (dr)^2 = dx^\alpha dx_\alpha = dx_\alpha dx^\alpha. \tag{9.94}$$

Applying Eq. (91) to the contravariant form of vector (93), we get

$$dx^\alpha = L^\alpha_{\ \beta}dx'^\beta. \tag{9.95}$$

But, with our new shorthand notation, we can also write the usual rule of differentiation of each
component $x^\alpha$, considering it as a (in our case, linear) function of 4 arguments $x'^\beta$, as follows:

$$dx^\alpha = \frac{\partial x^\alpha}{\partial x'^\beta}dx'^\beta. \tag{9.96}$$

Comparing Eqs. (95) and (96), we can rewrite the general Lorentz transform rule (91) in the new form,

$$A^\alpha = \frac{\partial x^\alpha}{\partial x'^\beta}A'^\beta, \tag{9.97a}$$

which evidently does not depend on the coordinate axes orientation.


It is straightforward to verify that the reciprocal transform may be presented as 


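$$A'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}A^\beta. \tag{9.97b}$$

However, the reciprocal transform differs from the direct one only by the sign of the relative velocity of
the frames, so that the transform is given by the inverse matrix $\partial x'^\alpha/\partial x^\beta$; for the coordinate choice shown
in Fig. 1, the matrix is

$$\frac{\partial x'^\alpha}{\partial x^\beta} = \begin{pmatrix}\gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}. \tag{9.98}$$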
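That the matrix with the flipped sign of β is indeed the inverse of the direct transform matrix can be confirmed by a direct multiplication. The snippet below is a minimal sketch in plain Python (nested lists, no external libraries), with an arbitrary test value β = 0.6; the helper names are illustrative.

```python
import math

def boost(beta):
    """Matrix of Eq. (9.92) for a relative velocity beta*c along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, beta * g, 0.0, 0.0],
            [beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

direct = boost(0.6)        # Eq. (9.92)
reciprocal = boost(-0.6)   # Eq. (9.98): the same matrix with beta -> -beta
product = matmul(reciprocal, direct)

for row in product:
    print(row)
```

The printed product should be the 4x4 unit matrix, up to rounding errors.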


39 Another way to write this relation is $(ds)^2 = g_{\alpha\beta}dx^\alpha dx^\beta = g^{\alpha\beta}dx_\alpha dx_\beta$, where double summation over indices α
and β is implied, and g is the so-called metric tensor,

$$g^{\alpha\beta} \equiv g_{\alpha\beta} \equiv \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix},$$

that may be used, in particular, to transfer a covariant vector into the corresponding contravariant one
and back: $A^\alpha = g^{\alpha\beta}A_\beta$, $A_\alpha = g_{\alpha\beta}A^\beta$. The metric tensor plays a key role in general relativity, in which it is
affected by gravity - “curved” by particle masses.
Since, according to Eqs. (84)-(85), covariant 4-vectors differ from the contravariant ones by the sign of
the spatial components, their direct transform is given by matrix (98). Hence their direct and reciprocal
transforms may be represented, respectively, as

$$A_\alpha = \frac{\partial x'^\beta}{\partial x^\alpha}A'_\beta, \qquad A'_\alpha = \frac{\partial x^\beta}{\partial x'^\alpha}A_\beta, \tag{9.99}$$


evidently satisfying the index balance rule. (Note that primed quantities are now multiplied, rather than
divided as in the contravariant case.) As a sanity check, let us apply this formalism to the scalar product
$A_\alpha A^\alpha$. As Eq. (96) shows, the implicit summation notation allows us to multiply and divide any equality
by the same partial differential of a coordinate, so that we can write:

$$A_\alpha A^\alpha = \frac{\partial x'^\beta}{\partial x^\alpha}A'_\beta\frac{\partial x^\alpha}{\partial x'^\gamma}A'^\gamma = \frac{\partial x'^\beta}{\partial x'^\gamma}A'_\beta A'^\gamma = \delta_{\beta\gamma}A'_\beta A'^\gamma = A'_\gamma A'^\gamma, \tag{9.100}$$

i.e. the scalar product $A_\alpha A^\alpha$ (as well as $A^\alpha A_\alpha$) is Lorentz-invariant, as it should be.
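This sanity check may also be run numerically. The sketch below assumes the coordinate choice of Fig. 1 (relative motion along x), two arbitrary test 4-vectors, and illustrative helper names. It transforms the contravariant components with the reciprocal matrix (9.98), checks that the scalar product is unchanged, and also checks that the covariant components are transformed by the matrix (9.92), with the opposite sign of β.

```python
import math

def boost(beta):
    """Matrix (9.92) for +beta; calling it with -beta gives the matrix (9.98)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, beta * g, 0.0, 0.0],
            [beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(matrix, vector):
    """Matrix-by-vector product for a 4x4 nested list."""
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]

def lower(vector):
    """Contravariant -> covariant: flip the sign of the spatial components."""
    return [vector[0]] + [-x for x in vector[1:]]

def dot(a_cov, b_contra):
    """Sum over the repeated index: A_alpha B^alpha."""
    return sum(x * y for x, y in zip(a_cov, b_contra))

beta = 0.8
a = [2.0, 0.3, -1.0, 0.5]     # arbitrary contravariant test vectors
b = [1.0, 0.7, 0.2, -0.4]

a_p = apply(boost(-beta), a)  # A'^alpha, Eq. (9.97b) with the matrix (9.98)
b_p = apply(boost(-beta), b)

before = dot(lower(a), b)
after = dot(lower(a_p), b_p)
a_p_cov = apply(boost(beta), lower(a))   # covariant components, second of Eqs. (9.99)

print(before, after)
print(lower(a_p), a_p_cov)
```

The two numbers on the first printed line should coincide, as should the two lists on the second one.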

Now, let us consider the 4-vectors of derivatives. Here we should be very careful. Consider, for 
example, the following vector operator 


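$$\frac{\partial}{\partial x^\alpha} \equiv \left\{\frac{\partial}{\partial(ct)}, \nabla\right\}. \tag{9.101}$$

As was discussed above, the operator is not changed by its multiplication and division by another
differential, e.g., $dx'^\beta$ (with the corresponding implied summation over β), so that

$$\frac{\partial}{\partial x^\alpha} = \frac{\partial x'^\beta}{\partial x^\alpha}\frac{\partial}{\partial x'^\beta}. \tag{9.102}$$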


But, according to the first of Eqs. (99), this is exactly how the covariant vectors are Lorentz- 
transformed! Hence, we have to consider the derivative over a contravariant space-time interval as a 
covariant 4-vector, and vice versa. 40 (This result might be also expected from the index balance rule.) In 
particular, this means that the scalar product 

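$$\frac{\partial}{\partial x^\alpha}A^\alpha = \frac{\partial A_0}{\partial(ct)} + \nabla\cdot\mathbf{A} \tag{9.103}$$

should be Lorentz-invariant for any legitimate 4-vector. A convenient shorthand for the covariant
derivative, which complies with the index balance rule, is

$$\frac{\partial}{\partial x^\alpha} \equiv \partial_\alpha, \tag{9.104}$$

so that the invariant scalar product may be written just as $\partial_\alpha A^\alpha$. A similar definition of the contravariant
derivative,

$$\partial^\alpha \equiv \frac{\partial}{\partial x_\alpha} = \left\{\frac{\partial}{\partial(ct)}, -\nabla\right\}, \tag{9.105}$$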


40 As was mentioned above, this is also a property of the “usual” transform of 3D vectors. 
allows us to write the Lorentz-invariant scalar product (103) in any of two forms:

$$\frac{\partial A_0}{\partial(ct)} + \nabla\cdot\mathbf{A} = \partial^\alpha A_\alpha = \partial_\alpha A^\alpha. \tag{9.106}$$


Finally, let us see how the general Lorentz transform changes 4-tensors. A second-rank 4x4
matrix is a legitimate 4-tensor if both 4-vectors it relates obey the Lorentz transform. For example, if
two legitimate 4-vectors are related as

$$A^\alpha = T^{\alpha\beta}B_\beta, \tag{9.107}$$

we should require that

$$A'^\alpha = T'^{\alpha\beta}B'_\beta, \tag{9.108}$$

where $A^\alpha$ and $A'^\alpha$ are related by Eqs. (97), while $B_\beta$ and $B'_\beta$, by Eqs. (99). This requirement immediately
yields

$$T^{\alpha\beta} = \frac{\partial x^\alpha}{\partial x'^\gamma}\frac{\partial x^\beta}{\partial x'^\delta}T'^{\gamma\delta}, \qquad T'^{\alpha\beta} = \frac{\partial x'^\alpha}{\partial x^\gamma}\frac{\partial x'^\beta}{\partial x^\delta}T^{\gamma\delta}, \tag{9.109}$$

with the implied summation over two indices, γ and δ. The rules for covariant and mixed tensors are
similar. 41


9.5. Maxwell equations in the 4-form 

This 4-vector formalism background is already sufficient to analyze the Lorentz transform of the 
electromagnetic field. Just to warm up, let us consider the continuity equation (4.5), 


$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \tag{9.110}$$

which expresses the electric charge conservation, and, as we already know, is compatible with the
Maxwell equations. If we now define the contravariant and covariant 4-vectors of electric current as

$$j^\alpha \equiv \{\rho c, \mathbf{j}\}, \qquad j_\alpha \equiv \{\rho c, -\mathbf{j}\}, \tag{9.111}$$

then Eq. (110) may be presented in the form

$$\partial_\alpha j^\alpha = \partial^\alpha j_\alpha = 0, \tag{9.112}$$

showing that the continuity equation is form-invariant 42 with respect to the Lorentz transform.


Of course, such equation form invariance does not mean that all component values of the 4- 
vectors participating in the equation are the same in both frames! For example, let us have some static 
charge density p in frame 0; then Eq. (97b), applied to the contravariant form of 4-vector (111), reads 


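41 It is straightforward to check that transfer between the contravariant and covariant forms of the same tensor
may be readily achieved using the same metric tensor g: $T_{\alpha\beta} = g_{\alpha\gamma}T^{\gamma\delta}g_{\delta\beta}$, $T^{\alpha\beta} = g^{\alpha\gamma}T_{\gamma\delta}g^{\delta\beta}$.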

42 Note that in some texts, the equations preserving their form at a transform are called “covariant”, creating a 
possibility for confusion with covariant vectors and tensors. On the other hand, calling such equations “invariant” 
does not distinguish them properly from invariant quantities, such as scalar products of 4-vectors. 
$$j'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}j^\beta, \qquad j^\beta = \{\rho c, 0, 0, 0\}. \tag{9.113}$$

Using the explicit form (98) of the reciprocal Lorentz matrix for the coordinate choice shown in Fig. 1,
we see that this relation yields

$$\rho' = \gamma\rho, \qquad j'_x = -\gamma\beta\rho c = -\gamma v\rho, \qquad j'_y = j'_z = 0. \tag{9.114}$$

Since the charge velocity, as observed from frame 0', is (-v), the nonrelativistic result would be j = -vρ.
The additional γ factor in the relativistic results for both charge density and current is caused by the
length contraction: dx' = dx/γ, so that in order to keep the total charge $dQ = \rho\,d^3r = \rho\,dx\,dy\,dz$ inside the
elementary volume $d^3r = dx\,dy\,dz$ intact, ρ (and hence $j_x$) should increase proportionally.
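The transform of a static charge density is a one-line matrix-by-vector product. A small sketch of it follows, assuming SI units, the reciprocal Lorentz matrix for relative motion along x, and arbitrary test values of ρ and v; the function names are illustrative.

```python
import math

C = 299_792_458.0   # speed of light, m/s

def reciprocal_boost(beta):
    """Reciprocal Lorentz matrix for relative velocity beta*c along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0.0, 0.0],
            [-beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def current_in_moving_frame(rho, v):
    """Transform j = {rho*c, 0, 0, 0}; returns (rho', j'_x, j'_y, j'_z)."""
    L = reciprocal_boost(v / C)
    j = [rho * C, 0.0, 0.0, 0.0]
    j_p = [sum(L[a][b] * j[b] for b in range(4)) for a in range(4)]
    return j_p[0] / C, j_p[1], j_p[2], j_p[3]

rho = 1.0e-6          # C/m^3, an assumed test value
v = 0.6 * C           # relative velocity of the frames, an assumed test value
rho_p, jx_p, jy_p, jz_p = current_in_moving_frame(rho, v)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

print(rho_p / rho, gamma)
print(jx_p, -gamma * v * rho)
```

For v = 0.6c, the charge density ratio should be γ = 1.25, and the current density should equal $-\gamma v\rho$.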

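Next, at the end of Chapter 6 we have seen that Maxwell equations for potentials φ and A may
be presented in a similar form (6.109), under the Lorenz (again, not “Lorentz” please!) gauge condition
(6.108). For the free space, this condition takes the form

$$\nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0. \tag{9.115}$$

This expression gives us a hint how to form the 4-vector of potentials: 43

$$A^\alpha \equiv \left\{\frac{\phi}{c}, \mathbf{A}\right\}, \qquad A_\alpha \equiv \left\{\frac{\phi}{c}, -\mathbf{A}\right\}; \tag{9.116}$$

indeed, these vectors satisfy Eq. (115) in its 4-vector form:

$$\partial_\alpha A^\alpha = \partial^\alpha A_\alpha = 0. \tag{9.117}$$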


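Since this scalar product is Lorentz-invariant, and derivatives (104)-(105) are legitimate 4-vectors,
this implies that 4-vector (116) is also legitimate, i.e. obeys the Lorentz transform formulas
(97), (99). A more convincing evidence of this fact may be obtained from Maxwell equations (6.109) for
the potentials. In free space, they may be rewritten as

$$\left[\frac{\partial^2}{\partial(ct)^2} - \nabla^2\right]\frac{\phi}{c} = \frac{\rho}{\varepsilon_0c} \equiv \mu_0(\rho c), \qquad \left[\frac{\partial^2}{\partial(ct)^2} - \nabla^2\right]\mathbf{A} = \mu_0\mathbf{j}. \tag{9.118}$$

Using definition (116), these equations may be merged to one: 44

$$\Box A^\alpha = \mu_0j^\alpha, \tag{9.119}$$

where □ is the d'Alembert operator 45 that may be presented as either of two scalar products,

$$\Box \equiv \frac{\partial^2}{\partial(ct)^2} - \nabla^2 = \partial^\beta\partial_\beta = \partial_\beta\partial^\beta, \tag{9.120}$$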


43 In the Gaussian units, the scalar potential should not be divided by c.

44 In the Gaussian units, coefficient $\mu_0$ in the right-hand part of Eq. (119) should be replaced, as usual, with $4\pi/c$.

45 Named after J.-B. d'Alembert (1717-1783). Note that in older textbooks, notation $\Box^2$ may be met for this
operator.
and hence is Lorentz-invariant. Because of that, and the fact that the Lorentz transform changes both
4-vectors $A^\alpha$ and $j^\alpha$ in a similar way, Eq. (119) does not depend on the reference frame choice. Thus we
have arrived at a key point of this chapter: we see that Maxwell equations are indeed form-invariant
with respect to the Lorentz transform. As a by-product, the 4-vector form (119) of these equations (for
potentials) is extremely simple - and beautiful.

However, as we have seen in Chapter 7, for many applications the Maxwell equations for field
vectors are more convenient; so let us present them in the 4-form. For that, we may express the
Cartesian components of the usual (3D) field vectors

$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}, \tag{9.121}$$

via those of the potential 4-vector $A^\alpha$. For example,




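$$E_x = -\frac{\partial\phi}{\partial x} - \frac{\partial A_x}{\partial t} = -c\left(\frac{\partial}{\partial x}\frac{\phi}{c} + \frac{\partial A_x}{\partial(ct)}\right) = -c(\partial^0A^1 - \partial^1A^0), \tag{9.122}$$

$$B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = -(\partial^2A^3 - \partial^3A^2). \tag{9.123}$$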


Completing similar calculations for other field components, we find that the following antisymmetric,
contravariant field-strength tensor,

$$F^{\alpha\beta} \equiv \partial^\alpha A^\beta - \partial^\beta A^\alpha, \tag{9.124}$$

may be expressed via the field components as follows: 46

$$F^{\alpha\beta} = \begin{pmatrix}0 & -E_x/c & -E_y/c & -E_z/c\\ E_x/c & 0 & -B_z & B_y\\ E_y/c & B_z & 0 & -B_x\\ E_z/c & -B_y & B_x & 0\end{pmatrix}, \tag{9.125a}$$

so that the covariant form of the tensor is

$$F_{\alpha\beta} \equiv g_{\alpha\gamma}F^{\gamma\delta}g_{\delta\beta} = \begin{pmatrix}0 & E_x/c & E_y/c & E_z/c\\ -E_x/c & 0 & -B_z & B_y\\ -E_y/c & B_z & 0 & -B_x\\ -E_z/c & -B_y & B_x & 0\end{pmatrix}. \tag{9.125b}$$


If this expression looks a bit too bulky, note that as a reward, the pair of inhomogeneous
Maxwell equations, i.e. the first two equations of the system (6.93), which in free space ($\mathbf{D} = \varepsilon_0\mathbf{E}$, $\mathbf{B} = \mu_0\mathbf{H}$)
may be rewritten as

$$\nabla\cdot\frac{\mathbf{E}}{c} = \mu_0c\rho, \qquad \nabla\times\mathbf{B} - \frac{\partial}{\partial(ct)}\frac{\mathbf{E}}{c} = \mu_0\mathbf{j}, \tag{9.126}$$


46 In Gaussian units, this formula, as well as Eq. (131) for $G^{\alpha\beta}$, does not have factors c in all the denominators.
may be now rewritten in a very simple (and manifestly form-invariant) way,

$$\partial_\alpha F^{\alpha\beta} = \mu_0j^\beta, \tag{9.127}$$

which is comparable with Eq. (119) in its beauty and simplicity. Somewhat counter-intuitively, the pair
of homogeneous Maxwell equations,

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \tag{9.128}$$

look, in the 4-vector notation, a bit more complicated: 47

$$\partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0. \tag{9.129}$$

Note, however, that Eq. (128) may be also represented in a much simpler form,

$$\partial_\alpha G^{\alpha\beta} = 0, \tag{9.130}$$

using the so-called dual (and also antisymmetric) tensor

$$G^{\alpha\beta} = \begin{pmatrix}0 & B_x & B_y & B_z\\ -B_x & 0 & -E_z/c & E_y/c\\ -B_y & E_z/c & 0 & -E_x/c\\ -B_z & -E_y/c & E_x/c & 0\end{pmatrix}, \tag{9.131}$$

which may be obtained from $F^{\alpha\beta}$, given by Eq. (125a), by the following replacements:

$$\frac{\mathbf{E}}{c} \to -\mathbf{B}, \qquad \mathbf{B} \to \frac{\mathbf{E}}{c}. \tag{9.132}$$

Besides the proof of the form-invariance of the Maxwell equations, the 4-vector formalism
allows us to achieve our initial goal: find out how the electric and magnetic field components change
at the transfer between reference frames. Let us apply to tensor $F^{\alpha\beta}$ the reciprocal Lorentz transform
given by the second of Eqs. (109). Generally, it gives, for each field component, a sum of 16 terms, but
since (for our choice of coordinates, shown in Fig. 1) there are many zeros in the Lorentz transform
matrix, and diagonal components of $F^{\gamma\delta}$ equal zero as well, the calculations are rather doable. Let us
calculate, for example, $E'_x \equiv -cF'^{01}$. The only nonvanishing terms in the right-hand part are

$$E'_x = -cF'^{01} = -c\left(\frac{\partial x'^0}{\partial x^1}\frac{\partial x'^1}{\partial x^0}F^{10} + \frac{\partial x'^0}{\partial x^0}\frac{\partial x'^1}{\partial x^1}F^{01}\right) = -c\gamma^2(\beta^2-1)\frac{E_x}{c} \equiv E_x. \tag{9.133}$$


Repeating the calculation for other 5 components of the fields, we get very important relations

$$\begin{aligned} E'_x &= E_x, & B'_x &= B_x,\\ E'_y &= \gamma(E_y - vB_z), & B'_y &= \gamma(B_y + vE_z/c^2),\\ E'_z &= \gamma(E_z + vB_y), & B'_z &= \gamma(B_z - vE_y/c^2), \end{aligned} \tag{9.134}$$
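The 16-term sums are tedious by hand but trivial for a computer. The sketch below is an illustration only, assuming units with c = 1 by default, relative motion of the frames along x, arbitrary test field values, and hypothetical function names. It builds the contravariant field-strength tensor, applies the reciprocal Lorentz matrix to both of its indices, reads the primed field components back, and compares them with the component formulas for the field transform.

```python
import math

def field_tensor(E, B, c=1.0):
    """Contravariant field-strength tensor as a nested 4x4 list."""
    ex, ey, ez = (component / c for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]

def reciprocal_boost(beta):
    """Reciprocal Lorentz matrix for relative velocity beta*c along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0.0, 0.0],
            [-beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_tensor(F, L):
    """F'^{ab} = L^a_g L^b_d F^{gd}: 16 terms per component."""
    n = range(4)
    return [[sum(L[a][g] * L[b][d] * F[g][d] for g in n for d in n) for b in n]
            for a in n]

def fields_from_tensor(F, c=1.0):
    """Read E and B back from the tensor layout used in field_tensor()."""
    E = [-c * F[0][1], -c * F[0][2], -c * F[0][3]]
    B = [-F[2][3], -F[3][1], -F[1][2]]
    return E, B

def transform_fields(E, B, beta, c=1.0):
    """Component formulas for the field transform, v = beta*c along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    v = beta * c
    E_p = [E[0], g * (E[1] - v * B[2]), g * (E[2] + v * B[1])]
    B_p = [B[0], g * (B[1] + v * E[2] / c**2), g * (B[2] - v * E[1] / c**2)]
    return E_p, B_p

E, B, beta = [1.0, 2.0, 3.0], [0.5, -1.0, 0.25], 0.6   # arbitrary test values
via_tensor = fields_from_tensor(transform_tensor(field_tensor(E, B), reciprocal_boost(beta)))
via_formulas = transform_fields(E, B, beta)

print(via_tensor)
print(via_formulas)
```

The two printed pairs of field vectors should coincide up to rounding errors.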


47 To be fair, note that just as Eq. (127), Eq. (129) is also a set of four scalar equations - in the latter case with
indices α, β and γ taking any three different values of the set {0, 1, 2, 3}.


whose more compact “semi-vector” form is

$$\mathbf{E}'_\parallel = \mathbf{E}_\parallel, \quad \mathbf{B}'_\parallel = \mathbf{B}_\parallel, \quad \mathbf{E}'_\perp = \gamma(\mathbf{E} + \mathbf{v}\times\mathbf{B})_\perp, \quad \mathbf{B}'_\perp = \gamma(\mathbf{B} - \mathbf{v}\times\mathbf{E}/c^2)_\perp, \tag{9.135}$$

where indices ∥ and ⊥ stand, respectively, for the field components parallel and perpendicular to the
relative velocity v of the two reference frames. In the nonrelativistic limit, the Lorentz factor γ tends to
1, and Eqs. (135) acquire an even simpler form

$$\mathbf{E}' \to \mathbf{E} + \mathbf{v}\times\mathbf{B}, \qquad \mathbf{B}' \to \mathbf{B} - \frac{1}{c^2}\mathbf{v}\times\mathbf{E}. \tag{9.136}$$

Thus we see that the electric and magnetic fields are actually transformed into each other even in the first
order of the v/c ratio. For example, if we fly across the field lines of a uniform, static, purely electric
field E (e.g., the one in a plane capacitor) we will see not only the electric field re-normalization (in the
second order of the v/c ratio), but also a nonvanishing dc magnetic field B' perpendicular to both vector
E and vector v, the direction of our motion. This is of course what might be expected from the relativity
principle: from the point of view of the moving observer (which is as legitimate as that of a stationary
observer), the surface charges of capacitor plates, that create field E, move back creating dc currents
(114) which induce the apparent magnetic field. Similarly, motion across a magnetic field creates, from
the point of view of the moving observer, an electric field.

This fact is very important philosophically. One can say there is no such thing in Mother Nature
as an electric field (or a magnetic field) all by itself. Not only can the electric field induce the magnetic
field (and vice versa) in dynamics, but even in an apparently static configuration, what exactly we
measure depends on our speed relative to the field sources - hence the very appropriate term for the
whole field we are studying: electromagnetism.

Another simple but very important application of Eqs. (134)-(135) is the calculation of the fields
created by a charged particle moving in free space by inertia, i.e. along a straight line with constant
velocity u, at the impact parameter 48 (the closest distance) b from the observer. Selecting frame 0' to
move with the particle in its origin, and frame 0 to reside in the “lab” (in which fields E and B are
measured), we can take v = u. In this case fields E' and B' may be calculated from, respectively,
electro- and magnetostatics, because in frame 0' the particle does not move:

$$\mathbf{E}' = \frac{q}{4\pi\varepsilon_0}\frac{\mathbf{r}'}{r'^3}, \qquad \mathbf{B}' = 0. \tag{9.137}$$


Selecting the coordinate axes so that at the measurement point x = 0, y = b, z = 0 (Fig. 11a), we may
write x' = -ut', y' = b, z' = 0, so that $r' = (u^2t'^2 + b^2)^{1/2}$, and the field components are as follows:

$$E'_x = -\frac{q}{4\pi\varepsilon_0}\frac{ut'}{(u^2t'^2+b^2)^{3/2}}, \quad E'_y = \frac{q}{4\pi\varepsilon_0}\frac{b}{(u^2t'^2+b^2)^{3/2}}, \quad E'_z = 0, \quad B'_x = B'_y = B'_z = 0. \tag{9.138}$$


Now using the last of Eqs. (19b), with x = 0, for the time transform, and the equations reciprocal to Eqs.
(134) for the field transform (it is evident that they are similar to the direct transform with v replaced
with -v = -u), in the lab frame we get


48 This term is very popular in the theory of particle scattering - see, e.g., CM Sec. 3.7.
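$$E_x = E'_x = -\frac{q}{4\pi\varepsilon_0}\frac{u\gamma t}{(u^2\gamma^2t^2+b^2)^{3/2}}, \quad E_y = \gamma E'_y = \frac{q}{4\pi\varepsilon_0}\frac{\gamma b}{(u^2\gamma^2t^2+b^2)^{3/2}}, \quad E_z = 0, \tag{9.139}$$

$$B_x = 0, \quad B_y = 0, \quad B_z = \frac{\gamma u}{c^2}E'_y = \frac{u}{c^2}\frac{q}{4\pi\varepsilon_0}\frac{\gamma b}{(u^2\gamma^2t^2+b^2)^{3/2}} \equiv \frac{u}{c^2}E_y. \tag{9.140}$$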


Fig. 9.11. Field pulses induced by a uniformly moving charge.


These results, 49 plotted in Fig. 11b, reveal two major effects. First, the charge passage by the
observer generates not only an electric field pulse, but also a magnetic field pulse. This is natural,
because, as was repeatedly discussed in Chapter 5, charge motion is essentially an electric current. 50
Second, Eqs. (139)-(140) show that the pulse duration scale is

$$\Delta t = \frac{b}{\gamma u} = \frac{b}{u}\left(1-\frac{u^2}{c^2}\right)^{1/2}, \tag{9.141}$$

i.e. shrinks to zero as the charge velocity u approaches the speed of light. This is of course a direct
corollary of the relativistic length contraction: in the frame 0' moving with the charge, the longitudinal
spread of its electric field at distance b from the motion line is of the order of Δx' = b. When observed
from the lab frame 0, this interval, in accordance with Eq. (20), shrinks to Δx = Δx'/γ = b/γ, and so does
the pulse duration scale Δt = Δx/u = b/γu.
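To get a feeling for these pulses, the lab-frame field expressions may be tabulated directly. The sketch below assumes SI units and arbitrary test values (an elementary charge passing at b = 1 μm with u = 0.9c); `lab_fields` and `pulse_duration_scale` are illustrative names only.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m (rounded reference value)
C = 299_792_458.0         # speed of light, m/s

def lab_fields(q, u, b, t):
    """E_x, E_y and B_z at the point x = 0, y = b, z = 0, at lab time t."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    denom = (u * u * g * g * t * t + b * b) ** 1.5
    k = q / (4.0 * math.pi * EPS0)
    e_x = -k * u * g * t / denom
    e_y = k * g * b / denom
    b_z = u * e_y / C**2
    return e_x, e_y, b_z

def pulse_duration_scale(u, b):
    """Delta t = b / (gamma * u)."""
    return b * math.sqrt(1.0 - (u / C) ** 2) / u

q, b = 1.602176634e-19, 1.0e-6     # assumed test values: elementary charge, 1 um
u = 0.9 * C
dt = pulse_duration_scale(u, b)
peak = lab_fields(q, u, b, 0.0)
shoulder = lab_fields(q, u, b, dt)

print(dt)
print(peak)
print(shoulder[1] / peak[1])
```

At t = Δt the transverse field drops to $2^{-3/2} \approx 0.35$ of its peak value, which is why Δt is a natural measure of the pulse duration.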


9.6. Relativistic particles in electric and magnetic fields 


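Now let us analyze the dynamics of charged particles in electric and magnetic fields. Inspired by
“our” success of forming the 4-vector (75) of energy-momentum,

$$p^\alpha = \left\{\frac{\mathcal{E}}{c}, \mathbf{p}\right\} = \gamma\{mc, m\mathbf{u}\} = m\frac{dx^\alpha}{d\tau} \equiv mu^\alpha, \tag{9.142}$$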


where u a is the contravariant form of the 4-velocity (63) of the particle, 


49 In the next chapter, we will re-derive them in a different way. 

50 It is straightforward to use Eq. (140) and the linear superposition principle to calculate, for example, the
magnetic field of a string of charges moving along the same line, and separated by equal distances Δx = a (so that
the average current, as measured in frame 0, is qu/a), and to show that the time-average of the magnetic field is
given by Eq. (5.20) of magnetostatics, with b instead of ρ.
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Charged 

particle’s 

dynamics 


Particle’s 
dynamics 
in 4-form 


Particle’s 

energy 

evolution 


$$ u^\alpha \equiv \frac{dx^\alpha}{d\tau}, \qquad u_\alpha \equiv \frac{dx_\alpha}{d\tau}, \tag{9.143} $$


we may notice that the nonrelativistic equation of motion resulting from the Lorentz-force formula
(5.10), for the three spatial components of p^α, at a charged particle’s motion in an electromagnetic field,

$$ \frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{u} \times \mathbf{B}), \tag{9.144} $$

is fully consistent with the following 4-vector equality (which is evidently form-invariant):

$$ \frac{dp^\alpha}{d\tau} = q F^{\alpha\beta} u_\beta. \tag{9.145} $$


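For example, the α = 1 component of this equation reads

$$ \frac{dp^1}{d\tau} = q F^{1\beta} u_\beta = q \left[ \frac{E_x}{c} \gamma c + 0 \cdot (-\gamma u_x) + (-B_z)(-\gamma u_y) + B_y (-\gamma u_z) \right] = q\gamma [\mathbf{E} + \mathbf{u} \times \mathbf{B}]_x, \tag{9.146} $$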


and similarly for the two other spatial components (α = 2 and α = 3). We see that these expressions differ
from the Newton law (144) by the extra factor γ. However, plugging into Eq. (146) the definition of the
proper time interval, dτ = dt/γ, and canceling γ on both sides, we recover Eq. (144) exactly - for any
velocity of the particle! The only caveat is that if u is comparable with c, p in Eq. (144) has to be
understood as the relativistic momentum (70), proportional to the velocity-dependent mass M = γm ≥ m
rather than to the rest mass m.
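The consistency of Eq. (145) with the 3D Lorentz force may also be checked numerically. The sketch below assumes the SI form of the contravariant field-strength tensor whose rows α = 0 and α = 1 are spelled out in Eqs. (146)-(147), with the two remaining rows filled in by antisymmetry; the function names and sample field values are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def field_tensor(E, B):
    """Contravariant F^{alpha beta} in SI units, index order (ct, x, y, z)."""
    ex, ey, ez = (e / C for e in E)
    bx, by, bz = B
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def four_force(q, E, B, u):
    """Right-hand side q F^{alpha beta} u_beta of Eq. (9.145), plus gamma."""
    gamma = 1.0 / math.sqrt(1.0 - sum(v * v for v in u) / C**2)
    u_cov = [gamma * C] + [-gamma * v for v in u]  # covariant 4-velocity
    F = field_tensor(E, B)
    rhs = [q * sum(F[a][b] * u_cov[b] for b in range(4)) for a in range(4)]
    return rhs, gamma


def lorentz_force(q, E, B, u):
    """Right-hand side q (E + u x B) of Eq. (9.144)."""
    cross = (
        u[1] * B[2] - u[2] * B[1],
        u[2] * B[0] - u[0] * B[2],
        u[0] * B[1] - u[1] * B[0],
    )
    return [q * (E[i] + cross[i]) for i in range(3)]
```

Dividing the spatial components returned by `four_force` by γ should reproduce `lorentz_force`, and the 0th component, multiplied by c/γ, should give the power qE·u of Eq. (148).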


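The only remaining task is to examine the meaning of the 0th component of Eq. (145). Let us
spell it out:

$$ \frac{dp^0}{d\tau} = q F^{0\beta} u_\beta = q \left[ 0 \cdot \gamma c + \left( -\frac{E_x}{c} \right)(-\gamma u_x) + \left( -\frac{E_y}{c} \right)(-\gamma u_y) + \left( -\frac{E_z}{c} \right)(-\gamma u_z) \right] = q\gamma \frac{\mathbf{E}}{c} \cdot \mathbf{u}. \tag{9.147} $$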


Recalling that p^0 = ℰ/c, and using dτ = dt/γ again, we see that Eq. (147) looks exactly like the
nonrelativistic relation for the kinetic energy change, 51

$$ \frac{d\mathscr{E}}{dt} = q \mathbf{E} \cdot \mathbf{u}, \tag{9.148} $$

besides that in the relativistic case the energy has to be taken in the general form (73).


No question, the 4-component equation (145) of relativistic dynamics is beautiful in its
simplicity. However, for the solution of particular problems, Eqs. (144) and (148) are frequently
preferable. As an illustration of this point, let us now use these equations to explore relativistic
effects in the motion of a charged particle in uniform, time-independent electric and magnetic fields.
In doing that, we will, for the time being, neglect the contributions to the field by the particle itself. 52


51 See, e.g., CM Eq. (1.20) with dp/dt = F = qE. (As a reminder, the magnetic field cannot affect particle’s energy, 
because the magnetic component of the Lorentz force is perpendicular to its velocity.) 

52 As was emphasized earlier in this course, in statics this contribution has to be ignored. In dynamics, this is
generally not true; these self-action effects will be discussed in Sec. 10.6.

(i) Uniform magnetic field. Let the magnetic field be constant and uniform in the “lab” reference
frame 0. Then in this frame, Eqs. (144) and (148) yield

$$ \frac{d\mathbf{p}}{dt} = q \mathbf{u} \times \mathbf{B}, \qquad \frac{d\mathscr{E}}{dt} = 0. \tag{9.149} $$

From the second equation, ℰ = const, we get u = const, β ≡ u/c = const, γ ≡ (1 - β²)^(-1/2) = const, and
M ≡ γm = const, so that the first of Eqs. (149) may be rewritten as

$$ \frac{d\mathbf{u}}{dt} = \mathbf{u} \times \boldsymbol{\omega}_c, \tag{9.150} $$


where ω_c is the vector directed along the magnetic field B, with the magnitude equal to the cyclotron
frequency (sometimes called “gyrofrequency”)

$$ \omega_c = \frac{qB}{M} = \frac{qB}{\gamma m} = \frac{q c^2 B}{\mathscr{E}}. \tag{9.151} $$


If the particle’s initial velocity u_0 is perpendicular to the magnetic field, Eq. (150) describes its
circular motion, with a constant speed u = u_0, in a plane perpendicular to B, and frequency (151). In the
nonrelativistic limit u ≪ c, when γ → 1, i.e. M → m, the cyclotron frequency is independent of the
speed, but as the kinetic energy becomes comparable with the rest energy of the particle, the
frequency decreases, and in the ultrarelativistic limit,

$$ \omega_c \approx q c \frac{B}{p}, \quad \text{at } u \approx c. \tag{9.152} $$


The cyclotron motion radius may be calculated as R = u/ω_c; in the nonrelativistic limit it is
proportional to the particle’s speed, i.e. to the square root of its kinetic energy. However, in the general case
the radius is proportional to the particle’s relativistic momentum rather than its speed:

$$ R = \frac{u}{\omega_c} = \frac{Mu}{qB} = \frac{m \gamma u}{qB} = \frac{1}{q} \frac{p}{B}, \tag{9.153} $$

so that in the ultrarelativistic limit, when p ≈ ℰ/c, R is proportional to the kinetic energy.


This dependence of ω_c and R on energy is the major factor in the design of circular accelerators of
charged particles. In the simplest of these machines (the cyclotron, invented in 1929 by E. Lawrence),
the frequency ω of the accelerating ac electric field is constant, so that even if it is tuned to ω_c of the
initially injected particles, the drop of the cyclotron frequency with energy eventually violates this tuning.
For this reason, the maximum particle speed is limited to just ~0.1 c (for protons, corresponding to a
kinetic energy of just ~15 MeV). This problem may be addressed in several ways. In particular, in
synchrotrons (such as Fermilab’s Tevatron and CERN’s LHC) the magnetic field is gradually increased
in time to compensate for the momentum increase (B ∝ p), so that both R (153) and ω_c (151) stay constant,
enabling proton acceleration to energies as high as ~7 TeV, i.e. ~7,500 mc². 53


53 For more on this topic, I have to refer the interested reader to special literature, for example either S. Lee,
Accelerator Physics, 2nd ed., World Scientific, 2004, or E. Wilson, An Introduction to Particle Accelerators,
Oxford U. Press, 2001.

Returning to our initial problem, if the particle’s initial velocity has a component u_∥ along the
magnetic field, it is conserved in time, so that the trajectory is a spiral around the magnetic field lines.
As Eqs. (149) show, in this case Eq. (150) remains valid, but in Eqs. (151) and (153) the full speed and
momentum have to be replaced with the magnitudes of their (also time-conserved) components, u_⊥ and p_⊥,
normal to B, while the Lorentz factor γ in those formulas still requires the full speed of the particle.

Finally, in the special case when the particle’s initial velocity is directed exactly along the magnetic
field, it continues to move along a straight line, along vector B. In this case, the cyclotron
frequency (151) remains finite, but does not correspond to any real motion, because R = 0.
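To get a feeling for the numbers behind Eqs. (151) and (153), one may use a short Python sketch like the following. The function name `cyclotron`, the rounded constants, and the sample field values are illustrative assumptions, and the motion is taken to be purely perpendicular to B.

```python
import math

C = 299_792_458.0        # speed of light, m/s
Q_E = 1.602176634e-19    # elementary charge, C
M_P = 1.67262192e-27     # proton rest mass, kg (rounded)


def cyclotron(kinetic_energy_eV, B, m=M_P, q=Q_E):
    """Return (gamma, omega_c, R) for motion perpendicular to a uniform field B."""
    rest = m * C**2                              # rest energy, J
    total = rest + kinetic_energy_eV * Q_E       # total energy, J
    gamma = total / rest
    p = math.sqrt(total**2 - rest**2) / C        # relativistic momentum
    omega_c = q * B * C**2 / total               # Eq. (9.151)
    radius = p / (q * B)                         # Eq. (9.153)
    return gamma, omega_c, radius


# Illustrative cases: a 15-MeV proton in 1 T, and a 7-TeV proton in 8 T.
for t_ev, b in ((15e6, 1.0), (7e12, 8.0)):
    g, w, r = cyclotron(t_ev, b)
    print(f"T = {t_ev:.1e} eV, B = {b} T: gamma = {g:.4g}, omega_c = {w:.4g} 1/s, R = {r:.4g} m")
```

For the first case, γ exceeds 1 by less than 2%, which is why a fixed-frequency cyclotron still works at such energies; for the second one, γ is of the order of several thousand.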

(ii) Uniform electric field. This problem is (technically) more complex than the previous one,
because in an electric field, the particle’s kinetic energy may change. Directing axis z along the field, from
Eq. (144) we get

$$ \frac{dp_z}{dt} = qE, \qquad \frac{d\mathbf{p}_\perp}{dt} = 0. \tag{9.154} $$

If the field does not change in time, the first integration of these equations is trivial,

$$ p_z(t) = p_z(0) + qEt, \qquad \mathbf{p}_\perp(t) = \mathrm{const} = \mathbf{p}_\perp(0), \tag{9.155} $$

but the further integration requires care, because the effective mass M = γm of the particle depends on its
full speed:

$$ u^2 = u_z^2 + u_\perp^2, \tag{9.156} $$

making the two motions, along and across the field, mutually dependent.

If the initial velocity is perpendicular to field E, i.e. if p_z(0) = 0, p_⊥(0) = p(0) ≡ p_0, the easiest
way to proceed is to calculate the particle’s energy first:

$$ \mathscr{E}^2 = (mc^2)^2 + c^2 p^2(t) = \mathscr{E}_0^2 + c^2 (qEt)^2, \quad \text{where } \mathscr{E}_0 \equiv \left[ (mc^2)^2 + c^2 p_0^2 \right]^{1/2}. \tag{9.157} $$

On the other hand, we can calculate the same energy by integrating Eq. (148),

$$ \frac{d\mathscr{E}}{dt} = q \mathbf{E} \cdot \mathbf{u} \equiv qE \frac{dz}{dt}, \tag{9.158} $$

over time, with a simple result:

$$ \mathscr{E} = \mathscr{E}_0 + qEz(t), \tag{9.159} $$

where (for the notation simplicity) I took z(0) = 0. Requiring Eq. (159) to give the same ℰ² as Eq.
(157), we get a quadratic equation for z(t),

$$ \mathscr{E}_0^2 + c^2 (qEt)^2 = \left[ \mathscr{E}_0 + qEz(t) \right]^2, \tag{9.160} $$

whose solution (with the sign before the square root corresponding to E > 0, i.e. z ≥ 0) is

$$ z(t) = \frac{\mathscr{E}_0}{qE} \left\{ \left[ 1 + \left( \frac{cqEt}{\mathscr{E}_0} \right)^2 \right]^{1/2} - 1 \right\}. \tag{9.161} $$

Now let us find the particle’s trajectory. Selecting axis x so that the initial velocity vector (and hence
the velocity vector at any further instant) is within the [x, z] plane, i.e. y(t) = 0, we may use Eqs. (155) to
calculate the trajectory’s slope, at its arbitrary point, as

$$ \frac{dz}{dx} \equiv \frac{dz/dt}{dx/dt} = \frac{Mu_z}{Mu_x} \equiv \frac{p_z}{p_x} = \frac{qEt}{p_0}. \tag{9.162} $$

Now let us use Eq. (160) to express the numerator of this fraction, qEt, as a function of z:

$$ qEt = \frac{1}{c} \left[ (\mathscr{E}_0 + qEz)^2 - \mathscr{E}_0^2 \right]^{1/2}. \tag{9.163} $$

Plugging this expression into Eq. (162), we get

$$ \frac{dz}{dx} = \frac{1}{cp_0} \left[ (\mathscr{E}_0 + qEz)^2 - \mathscr{E}_0^2 \right]^{1/2}. \tag{9.164} $$

This differential equation may be readily integrated, separating variables z and x, and using the substitution
ξ ≡ arccosh(qEz/ℰ_0 + 1). Selecting the origin of axis x at the initial point, so that x(0) = 0, we finally get
the trajectory:

$$ z = \frac{\mathscr{E}_0}{qE} \left( \cosh \frac{qEx}{cp_0} - 1 \right). \tag{9.165} $$


At the initial part of the trajectory, where qEx ≪ cp_0, this expression may be approximated
by the first nonvanishing term of the Taylor series, giving a parabola:

$$ z \approx \frac{\mathscr{E}_0 qE}{2} \left( \frac{x}{cp_0} \right)^2, \tag{9.166} $$

so that if the initial velocity of the particle is much less than c (i.e. p_0 ≈ mu_0, ℰ_0 ≈ mc²), we get the
familiar nonrelativistic formula:

$$ z = \frac{qE}{2mu_0^2} x^2. \tag{9.167} $$


This solution may be readily generalized to the case of an arbitrary direction of particle’s initial 
velocity; this generalization is left for reader’s exercise. 
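The solution (161) and the trajectory (165) may be cross-checked numerically. The sketch below is only an illustration: the expression used for x(t) is not given in the text; it follows from integrating u_x = c²p_0/ℰ(t) with ℰ(t) from Eq. (157), and the function names are my own.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def rest_plus_transverse_energy(m, p0):
    """Initial energy of Eq. (9.157): [(m c^2)^2 + (c p0)^2]^(1/2)."""
    return math.sqrt((m * C**2) ** 2 + (C * p0) ** 2)


def trajectory_point(t, q, E, m, p0):
    """Position (x, z) at time t, for initial momentum p0 normal to E, x(0) = z(0) = 0."""
    e0 = rest_plus_transverse_energy(m, p0)
    s = C * q * E * t / e0
    z = e0 / (q * E) * (math.sqrt(1.0 + s * s) - 1.0)   # Eq. (9.161)
    x = C * p0 / (q * E) * math.asinh(s)                # integral of c^2 p0 / energy(t)
    return x, z


def catenary(x, q, E, m, p0):
    """Trajectory z(x) of Eq. (9.165)."""
    e0 = rest_plus_transverse_energy(m, p0)
    return e0 / (q * E) * (math.cosh(q * E * x / (C * p0)) - 1.0)
```

For any t, the pair (x, z) returned by `trajectory_point` should satisfy z = `catenary(x, ...)`, confirming that Eqs. (161) and (165) describe the same motion.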


(iii) Crossed uniform magnetic and electric fields (E ⊥ B). In view of how bulky the solution
of the previous problem (i.e. the particular case of the current problem for B = 0) was, one might think
that this problem should be forbiddingly complex for an analytical solution. Counter-intuitively, this is not
the case, thanks to the help from the field transform relations (135). Let us consider two possible cases.

Case I: E/c < B. Let us consider an inertial frame moving (relative to the “lab” reference frame 0
in which fields E and B are defined) with velocity

$$ \mathbf{v} = \frac{\mathbf{E} \times \mathbf{B}}{B^2}, \tag{9.168} $$

whose magnitude v = c × (E/c)/B < c. Selecting the coordinate axes as shown in Fig. 12, so that

$$ E_x = 0, \quad E_y = E, \quad E_z = 0; \qquad B_x = 0, \quad B_y = 0, \quad B_z = B, \tag{9.169} $$

we see that the Cartesian components of this velocity are v_x = v, v_y = v_z = 0.


Fig. 9.12. Particle’s trajectory in 
crossed electric and magnetic fields 
(at E/c < B). 


Since this choice of coordinates complies with that used to derive Eqs. (134), we can readily use
that simple form of the Lorentz transform to calculate the field components in the moving reference frame:

$$ E'_x = 0, \quad E'_y = \gamma (E - vB) = \gamma \left( E - \frac{E}{B} B \right) = 0, \quad E'_z = 0, \tag{9.170} $$

$$ B'_x = 0, \quad B'_y = 0, \quad B'_z = \gamma \left( B - \frac{vE}{c^2} \right) = \gamma B \left( 1 - \frac{vE}{Bc^2} \right) = \gamma B \left( 1 - \frac{v^2}{c^2} \right) = \frac{B}{\gamma} \le B, \tag{9.171} $$

where the Lorentz parameter γ ≡ (1 - v²/c²)^(-1/2) corresponds to velocity (168) rather than that of the
particle.

Thus in this special reference frame the particle only sees a (re-normalized) uniform magnetic
field B' ≤ B, parallel to the initial field, i.e. perpendicular to velocity (168). Using the result of the above
example (i), we see that in this frame the particle will move along either a circle or a spiral winding
about the direction of the magnetic field, with angular speed (151),

$$ \omega'_c = \frac{qB'}{\mathscr{E}'/c^2}, \tag{9.172} $$

and radius (153):

$$ R' = \frac{p'_\perp}{qB'}. \tag{9.173} $$

Hence in the lab frame, the particle will perform such orbital motion plus a “drift” with constant velocity
v (Fig. 12). As a result, the lab-frame trajectory of the particle (or rather its projection onto the plane
perpendicular to the magnetic field) is a trochoid-like curve 54 that, depending on the initial velocity,
may be either prolate (self-crossing), as in Fig. 12, or curtate (stretched so much that it is not
self-crossing).


54 As a reminder, a trochoid may be described as the trajectory of a point on a rigid disk rolled along a straight
line. Its canonical parametric presentation is x = Θ + a cos Θ, y = a sin Θ. (For a > 1, the trochoid is prolate; if
a < 1, it is curtate; and if a = 1, it is called the cycloid.) Note, however, that for our problem, the trajectory in the lab
frame is exactly trochoidal only in the nonrelativistic limit v ≪ c (i.e. E/c ≪ B), because otherwise the Lorentz
contraction in the drift direction squeezes the cyclotron orbit from a circle into an ellipse.

Such looped motion of electrons (in practice, with v ≪ c) is used, in particular, in magnetrons -
generators of microwave radiation. In these devices (Fig. 13a), the magnetic field, usually created by 
specially-shaped permanent magnets, is nearly uniform (in the region of electron motion) and directed 
along magnetron’s axis, while the electric field of magnitude E « cB, created by the dc voltage applied 
between the anode and cathode, is virtually radial. As a result, the above simple theory is only 
approximately valid, and electron trajectories are close to epicycloids rather than trochoids. The applied 
electric field is adjusted so that these trajectories pass close to the gap openings to cylindrical 
microwave cavities drilled in magnetron’s bulk anode (Fig. 13b). The fundamental mode of each cavity 
is quasi-lumped, with cylindrical walls working mostly as lumped inductances, and gaps as lumped 
capacitances, with the microwave electric field concentrated in the gap openings. This is why the mode 
is strongly coupled to the passing electrons, and their interaction creates large positive feedback 
(equivalent to negative damping) that results in intensive microwave self-oscillations at cavities’ 
eigenfrequency. 55 The oscillation energy, of course, is taken from the dc-field-accelerated electrons; due 
to the energy loss each electron gradually moves closer to the anode and finally lands on its surface. The 
wide use of such generators (in particular, in microwave ovens, which operate in a narrow frequency 
band around 2.45 GHz, allocated for these devices to avoid their interference with wireless 
communication systems) is due to their simplicity and high (up to 65%) efficiency.

Fig. 9.13. Magnetron. (Adapted from
http://microwavetubes.iwarp.com .) 


Case II: E/c > B. In this case, the speed given by Eq. (168) would be above the speed of light, so
let us introduce a reference frame moving with a different velocity,

$$ \mathbf{v} = \frac{\mathbf{E} \times \mathbf{B}}{(E/c)^2}, \tag{9.174} $$

whose direction is the same as before (Fig. 12), and whose magnitude v = c × B/(E/c) is again below c. A
calculation absolutely similar to the one performed above for Case I yields

$$ E'_x = 0, \quad E'_y = \gamma (E - vB) = \gamma E \left( 1 - \frac{vB}{E} \right) = \gamma E \left( 1 - \frac{v^2}{c^2} \right) = \frac{E}{\gamma} \le E, \quad E'_z = 0, \tag{9.175} $$

$$ B'_x = 0, \quad B'_y = 0, \quad B'_z = \gamma \left( B - \frac{vE}{c^2} \right) = \gamma \left( B - \frac{EB}{E} \right) = 0, \tag{9.176} $$

so that in the moving frame the particle sees only an electric field E' ≤ E. According to the solution of
our previous problem (ii), the trajectory of the particle in the moving frame is hyperbolic, so that in the
lab frame it has an “open”, hyperbolic character as well.

55 See, e.g., CM Sec. 4.4.
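Both cases may be illustrated with one short Python sketch. It assumes the geometry of Eq. (169) (E along y, B along z, drift along x) and the boost formulas used in Eqs. (170)-(171) and (175)-(176); the function name `drift_frame` and the sample field magnitudes are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def drift_frame(E, B):
    """Return (v, gamma, E'_y, B'_z) in the frame moving along x with the drift speed.

    E and B are the (positive) lab-frame magnitudes of E_y and B_z.
    """
    if E / C < B:
        v = E / B                  # Case I, Eq. (9.168)
    elif E / C > B:
        v = C**2 * B / E           # Case II, Eq. (9.174)
    else:
        raise ValueError("E/c = B is not covered by Cases I and II")
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    e_prime = gamma * (E - v * B)            # transformed E_y
    b_prime = gamma * (B - v * E / C**2)     # transformed B_z
    return v, gamma, e_prime, b_prime


print(drift_frame(1.0e6, 0.1))   # Case I: the electric field is (numerically) removed
print(drift_frame(6.0e8, 1.0))   # Case II: the magnetic field is (numerically) removed
```

In Case I the function returns a vanishing E' and B' = B/γ, while in Case II it returns a vanishing B' and E' = E/γ, up to rounding errors.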

To conclude this section, let me note that if the electric and magnetic fields are non-uniform, the
particle motion is much more complex, and in most cases the integration of equations (144), (148) may
be carried out only numerically. However, if the field nonuniformity is small, (approximate) analytical
methods may be very effective. For example, if the magnetic field has a small transverse gradient ∇B
in a direction perpendicular to vector B itself, such that

$$ \eta \equiv \frac{|\nabla B|}{B} \ll \frac{1}{R}, \tag{9.177} $$

where R is the cyclotron radius (153), then it is straightforward to use Eq. (150) to show 56 that the
cyclotron orbit drifts perpendicular to both B and ∇B, with speed

$$ v_d \approx \frac{\eta}{\omega_c} \left( \frac{1}{2} u_\perp^2 + u_\parallel^2 \right) \ll u. \tag{9.178} $$

The physics of this drift is rather simple: according to Eq. (153), the instant curvature of the
cyclotron orbit is proportional to the local value of the field. Hence if the field is nonuniform, the
trajectory bends more on its parts passing through the stronger field, thus acquiring a shape close to a
curtate trochoid.


For engineering and experimental practice, the effects of longitudinal gradients of the magnetic field on
charged particle motion are much more important, but let me postpone their discussion until we have
a few more analytical tools, developed in the next section.


9.7. Analytical mechanics of charged particles

Equation (145) gives a full description of relativistic particle dynamics in electric and magnetic 
fields, just as the 2 nd Newton law (1) does it in the nonrelativistic limit. However, we know that in the 
latter case, the Lagrange formalism of analytical mechanics allows an easier solution of many 
problems. 57 We can fully expect that to be true in relativistic mechanics as well, so let us expand the 
analysis of Sec. 3 to particles in the field. 

Let us recall that for a free particle, our main result was Eq. (68), which may be rewritten as

$$ \gamma \mathscr{L} = -mc^2, \tag{9.179} $$

showing that this product is Lorentz-invariant. How can the electromagnetic field affect this relation? In
electrostatics, we could write

$$ \mathscr{L} = T - U = T - q\phi. \tag{9.180} $$


56 See, e.g., Sec. 12.4 in J. D. Jackson, Classical Electrodynamics, 3 rd ed., Wiley, 1999. 

57 See, e.g., CM Sec. 2.2 and beyond. 


Chapter 9 


Page 37 of 54 





Essential Graduate Physics 


EM: Classical Electrodynamics 


However, in relativity the scalar potential φ is just one component of the potential 4-vector (116). The
only way to get a Lorentz-invariant contribution to γℒ from the full 4-vector, that would be also
proportional to the Lorentz force, i.e. to the first power of the particle’s velocity (to account for the
magnetic component of the Lorentz force), is evidently

$$ \gamma \mathscr{L} = -mc^2 + \mathrm{const} \times u^\alpha A_\alpha, \tag{9.181} $$

where u^α is the 4-velocity (63). In order to comply with Eq. (180) in electrostatics, the constant factor
should be equal to (-q), so that Eq. (181) becomes

$$ \gamma \mathscr{L} = -mc^2 - q u^\alpha A_\alpha, \tag{9.182} $$

and, finally,

$$ \mathscr{L} = -\frac{mc^2}{\gamma} - q\phi + q \mathbf{u} \cdot \mathbf{A}, \tag{9.183} $$

Lagrangian function

i.e., in the Cartesian form,

$$ \mathscr{L} = -mc^2 \left( 1 - \frac{u_x^2 + u_y^2 + u_z^2}{c^2} \right)^{1/2} - q\phi + q (u_x A_x + u_y A_y + u_z A_z). \tag{9.184} $$


Let us see whether this relation (that admittedly was obtained above by an educated guess rather
than by a strict derivation) passes a natural sanity check. For the case of unconstrained motion of a
particle, we can select its three Cartesian coordinates r_j (j = 1, 2, 3) as the generalized coordinates, and
linear velocity components u_j as the corresponding generalized velocities. In this case, the Lagrange
equations of motion are 58

$$ \frac{d}{dt} \frac{\partial \mathscr{L}}{\partial u_j} - \frac{\partial \mathscr{L}}{\partial r_j} = 0. \tag{9.185} $$

For example, for r_1 = x, Eq. (184) yields

$$ \frac{\partial \mathscr{L}}{\partial u_x} = \frac{m u_x}{(1 - u^2/c^2)^{1/2}} + qA_x \equiv p_x + qA_x, \qquad \frac{\partial \mathscr{L}}{\partial x} = -q \frac{\partial \phi}{\partial x} + q \mathbf{u} \cdot \frac{\partial \mathbf{A}}{\partial x}, \tag{9.186} $$

so that Eq. (185) takes the form

$$ \frac{dp_x}{dt} = -q \frac{\partial \phi}{\partial x} + q \mathbf{u} \cdot \frac{\partial \mathbf{A}}{\partial x} - q \frac{dA_x}{dt}. \tag{9.187} $$

In equations of motion, field values have to be taken at the instant position of the particle, so that
the last (full) derivative has components due to both the actual field change (at a fixed point of space)
and the particle’s motion. Such addition is described by the so-called convective derivative 59

$$ \frac{d}{dt} = \frac{\partial}{\partial t} + \mathbf{u} \cdot \nabla. \tag{9.188} $$

Convective derivative


58 See, e.g., CM Sec. 2.1. 

59 Alternatively called the “Lagrangian derivative”; for its (rather simple) derivation see, e.g., CM Sec. 8.3.

Spelling out both scalar products, we may group the terms remaining after cancellations as follows:

$$ \frac{dp_x}{dt} = q \left[ \left( -\frac{\partial \phi}{\partial x} - \frac{\partial A_x}{\partial t} \right) + u_y \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) - u_z \left( \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \right) \right]. \tag{9.189} $$

But taking into account relations (121) between the electric and magnetic fields and potentials, this
expression is nothing more than

$$ \frac{dp_x}{dt} = q (E_x + u_y B_z - u_z B_y) = q (\mathbf{E} + \mathbf{u} \times \mathbf{B})_x, \tag{9.190} $$

i.e. the x-component of Eq. (144). Since other Cartesian coordinates participate in Eq. (184) in a similar
way, it is evident that the Lagrangian equations of motion along other coordinates yield other
components of the same vector equation of motion.

So, Eq. (183) does indeed give the correct Lagrangian function, and we can use it for the further
analysis, in particular to discuss the first of Eqs. (186). This relation shows that in the electromagnetic
field, the generalized momentum corresponding to the particle’s coordinate x is not p_x = mγu_x, but 60

$$ P_x \equiv \frac{\partial \mathscr{L}}{\partial u_x} = p_x + qA_x. \tag{9.191} $$

Thus, as was already mentioned in brief in Sec. 6.3, the particle’s motion in a field may be described by
two momentum vectors: the kinetic momentum p, defined by Eq. (70), and the canonical (or
“conjugate”) momentum 61

Particle’s canonical momentum

$$ \mathbf{P} = \mathbf{p} + q\mathbf{A}. \tag{9.192} $$
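Relation (191) is easy to verify numerically by differentiating the Lagrangian function (184) over u_x. The following sketch uses a central finite difference; the function names, the step h, and the sample values of the potentials are illustrative assumptions rather than anything prescribed by the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lagrangian(u, phi, A, m, q):
    """Lagrangian function of Eq. (9.184) for velocity u and potentials phi, A."""
    u2 = sum(v * v for v in u)
    u_dot_a = sum(v * a for v, a in zip(u, A))
    return -m * C**2 * math.sqrt(1.0 - u2 / C**2) - q * phi + q * u_dot_a


def canonical_momentum_x(u, phi, A, m, q, h=1.0e3):
    """Central-difference estimate of dL/du_x, with velocity step h (m/s)."""
    u_plus = (u[0] + h, u[1], u[2])
    u_minus = (u[0] - h, u[1], u[2])
    return (lagrangian(u_plus, phi, A, m, q) - lagrangian(u_minus, phi, A, m, q)) / (2.0 * h)
```

For any subluminal velocity, the returned value should agree with γmu_x + qA_x; note that the scalar potential drops out of the derivative, as it should.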


In order to facilitate the discussion of this notion, let us generalize expression (72) for the
Hamiltonian function ℋ of a free particle to the case of a particle in the field:

$$ \mathscr{H} \equiv \mathbf{P} \cdot \mathbf{u} - \mathscr{L} = (\mathbf{p} + q\mathbf{A}) \cdot \mathbf{u} - \left( -\frac{mc^2}{\gamma} + q \mathbf{u} \cdot \mathbf{A} - q\phi \right) = \mathbf{p} \cdot \mathbf{u} + \frac{mc^2}{\gamma} + q\phi. \tag{9.193} $$

Merging the first two terms exactly as it was done in Eq. (72), we get an extremely simple result,

$$ \mathscr{H} = \gamma mc^2 + q\phi, \tag{9.194} $$

that may leave us wondering: where is the vector-potential A here - and the field effects it has to
describe? The resolution of this puzzle is easy: for practical use (e.g., for the alternative derivation of
the equations of motion), ℋ has to be presented as a function of the particle’s generalized coordinates (in
the case of unconstrained motion, these may be the Cartesian components of vector r that serves as an
argument for potentials A and φ), and the generalized momenta, i.e. the Cartesian components of vector
P (plus, generally, time). Hence, velocity u and factor γ should be eliminated from Eq. (194). This may
be done using relation (192), γmu = P - qA. For such elimination, it is sufficient to notice that according


60 With regrets, I have to use the same (common) notation as was used earlier for the electric polarization - which 
is not discussed below. 

61 In Gaussian units, Eq. (192) has the form P = p + qA/c.

to Eq. (193), the difference (ℋ - qφ) is equal to the right-hand part of Eq. (72), so that the generalization of
Eq. (78) is 62

$$ (\mathscr{H} - q\phi)^2 = (mc^2)^2 + c^2 (\mathbf{P} - q\mathbf{A})^2. \tag{9.195} $$

Particle’s Hamiltonian

It is straightforward to verify that the Hamilton equations of motion for the three Cartesian coordinates of
the particle, obtained in the regular way 63 from this ℋ, may be merged into the same vector equation
(144). In the nonrelativistic limit, the Taylor expansion of Eq. (195) to the first term in p² yields the
following generalization of Eq. (74):

$$ \mathscr{H} - mc^2 \approx \frac{p^2}{2m} + U = \frac{1}{2m} (\mathbf{P} - q\mathbf{A})^2 + U, \qquad U = q\phi. \tag{9.196} $$
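The accuracy of the nonrelativistic approximation (196) may be probed with a small numerical comparison against the exact relation (195). This is only a sketch; the function names and the sample values (an electron-like particle at u ≈ 0.01c) are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def hamiltonian(P, A, phi, m, q):
    """Exact Hamiltonian from Eq. (9.195), including the rest energy."""
    k2 = sum((p_i - q * a_i) ** 2 for p_i, a_i in zip(P, A))  # (P - qA)^2
    return math.sqrt((m * C**2) ** 2 + C**2 * k2) + q * phi


def hamiltonian_nr(P, A, phi, m, q):
    """Nonrelativistic form of Eq. (9.196), with the rest energy subtracted."""
    k2 = sum((p_i - q * a_i) ** 2 for p_i, a_i in zip(P, A))
    return k2 / (2.0 * m) + q * phi


m, q = 9.109e-31, -1.602e-19           # electron-like particle
A, phi = (1.0e-6, 0.0, 0.0), 2.0       # sample potentials (SI units)
p = (m * 0.01 * C, 0.0, 0.0)           # kinetic momentum at u close to 0.01 c
P = tuple(p_i + q * a_i for p_i, a_i in zip(p, A))  # canonical momentum, Eq. (9.192)
print(hamiltonian(P, A, phi, m, q) - m * C**2, hamiltonian_nr(P, A, phi, m, q))
```

At such a low speed the two printed numbers differ only by a relative correction of the order of (u/c)².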


This expression for ℋ, and Eq. (183) for ℒ, give a clear view of how the electromagnetic field effects
are accounted for in analytical mechanics. The electric part of the total Lorentz force q(E + u×B) can perform
work on the particle, i.e. change its kinetic energy - see Eq. (148) and its discussion. As a result, the
scalar potential φ, whose gradient gives a contribution to E, may be directly associated with the potential
energy U = qφ. On the contrary, the magnetic component qu×B of the Lorentz force is always
perpendicular to the particle’s velocity u, cannot work on it, and as a result cannot be described by a
contribution to U. However, if A did not participate in functions ℒ and/or ℋ at all, analytical mechanics
would be unable to describe the effects of the magnetic field B = ∇×A on the particle’s motion. Relations (183) and
(196) show the wonderful way in which physics (or Mother Nature herself?) solves this problem: the
vector-potential gives such contributions to both ℒ and ℋ (if the latter is considered, as it should be, a
function of P rather than p) that cannot be uniquely attributed to either kinetic or potential energy, but
ensure the correct equation of motion (144) in both the Lagrange and Hamilton formalisms.

I believe I still owe the reader a clear discussion of the physical sense of the canonical
momentum P. For that, let us consider a particle moving near a region of localized magnetic field B(r, t),
but not entering this region. If there is no electrostatic field (no other electric charges nearby), we can
select such a local gauge that φ(r, t) = 0 and A = A(t), so that Eq. (144) is reduced to

$$ \frac{d\mathbf{p}}{dt} = q\mathbf{E} = -q \frac{d\mathbf{A}}{dt}, \tag{9.197} $$

immediately giving

$$ \frac{d\mathbf{P}}{dt} \equiv \frac{d\mathbf{p}}{dt} + q \frac{d\mathbf{A}}{dt} = 0. \tag{9.198} $$

Hence, even if the magnetic field is changed in time, so that the induced electric field accelerates the
particle, its conjugate momentum does not change. Hence P is a variable more stable to magnetic field
changes than its kinetic counterpart p. This conclusion may be criticized because it relies on a specific
gauge, and generally P = p + qA is not gauge-invariant, because vector-potential A isn’t. 64 However, as


62 This relation may be also obtained from the expression for the Lorentz-invariant norm, p^α p_α = (mc)², of the
4-momentum (75), p^α = {ℰ/c, p} = {(ℋ - qφ)/c, P - qA}.

63 See, e.g., CM Sec. 10.1. 

64 The kinetic momentum p = Mu is just the usual mu product modified for relativistic effects, so that this variable
is evidently gauge- (though not Lorentz-) invariant.

was already discussed in Sec. 5.3, the integral ∮A·dr over a closed contour does not depend on the chosen
gauge and equals the magnetic flux Φ through the area limited by the contour - see Eq. (5.65).
Integrating Eq. (197) over a closed trajectory of a particle (Fig. 14), and over the time of one orbit, we
get

$$ \Delta \oint_C \mathbf{p} \cdot d\mathbf{r} = -q \Delta\Phi, \quad \text{so that} \quad \Delta \oint_C \mathbf{P} \cdot d\mathbf{r} = 0, \tag{9.199} $$

where ΔΦ is the change of flux during that time. This gauge-invariant result confirms the above
conclusion about the stability of the canonical momentum to magnetic field variations.



Fig. 9.14. Particle’s motion around a localized 
magnetic flux. 


Generally, Eq. (199) is invalid if a particle moves inside a magnetic field and/or changes its
trajectory at the field variation. However, if the field is almost uniform, i.e. its gradient is small in the
sense of Eq. (177), this result is (approximately) applicable. Indeed, analytical mechanics 65 tells us that
for any canonical coordinate-momentum pair {q_j, p_j}, the corresponding action variable,

$$ J_j \equiv \frac{1}{2\pi} \oint p_j \, dq_j, \tag{9.200} $$

is asymptotically constant at slow variations of motion conditions. According to Eq. (191), for a particle
in a magnetic field, the generalized momentum corresponding to Cartesian coordinate r_j is P_j rather than
p_j. Thus forming the net action variable J ≡ J_x + J_y + J_z, we may write

$$ 2\pi J = \oint \mathbf{P} \cdot d\mathbf{r} = \oint \mathbf{p} \cdot d\mathbf{r} + q\Phi = \mathrm{const}. \tag{9.201} $$

Let us apply this relation to the motion of a nonrelativistic particle in an almost uniform
magnetic field, with a small longitudinal velocity, u_∥/u_⊥ → 0 (Fig. 15).



Fig. 9.15. Particle in a magnetic field with
a small longitudinal gradient ∇B ∥ B.


In this case, Φ in Eq. (201) is the flux encircled by a cyclotron orbit, equal to (-πR²B), where R is
its radius given by Eq. (153), and the negative sign accounts for the fact that the “correct” direction of
the normal vector n in the definition of flux, Φ = ∫B·n d²r, is antiparallel to vector B. At u ≪ c, the
kinetic momentum is just p_⊥ = mu_⊥, while Eq. (153) yields

$$ m u_\perp = qBR. \tag{9.202} $$

Plugging these relations into Eq. (201), we get

$$ 2\pi J = m u_\perp 2\pi R - q\pi R^2 B = m \frac{qBR}{m} 2\pi R - q\pi R^2 B = (2 - 1) q\pi R^2 B \equiv -q\Phi. \tag{9.203} $$

65 See, e.g., CM Sec. 10.2.

This means that even if the circular orbit slowly moves in the magnetic field, the flux encircled
by the cyclotron orbit should remain constant. One manifestation of this effect is the result already
mentioned at the end of Sec. 6: if a small gradient of the magnetic field is perpendicular to the field
itself, the particle orbit’s drift is perpendicular to ∇B, so that Φ stays constant. Now let us analyze the case
of a small longitudinal gradient, ∇B ∥ B (Fig. 15). If the small initial longitudinal velocity u_∥ is directed
toward the higher-field region, in order to keep Φ constant, the cyclotron orbit has to gradually shrink.
Rewriting Eq. (202) as

$$ m u_\perp = q \frac{\pi R^2 B}{\pi R} = q \frac{|\Phi|}{\pi R}, \tag{9.204} $$

we see that this reduction of R (at constant Φ) should increase the orbiting speed u_⊥. But since the
magnetic field cannot do work on the particle, its kinetic energy,

$$ \mathscr{E} = \frac{m}{2} \left( u_\parallel^2 + u_\perp^2 \right), \tag{9.205} $$

should stay constant, so that the longitudinal velocity u_∥ has to decrease. Hence eventually the orbit’s drift
has to stop, and then the orbit has to start moving back toward the region of lower fields, being
essentially repulsed from the high-field region. This effect is very important, in particular, for plasma
confinement: two coaxial magnetic coils, inducing magnetic fields of the same direction (Fig. 16),
naturally form a “magnetic bottle” that traps charged particles injected, with sufficiently low
longitudinal velocities, into the region between the coils. Such bottles are the core components of the
(generally, very complex) systems used for plasma confinement, in particular in the context of the
long-term efforts to achieve controllable nuclear fusion. 66





Fig. 9.16. Magnetic bottle (VERY schematically). 
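The reflection condition implied by the above argument may be made quantitative. Combining Φ = πR²B = const with Eq. (202) gives u_⊥² ∝ B, and the energy conservation (205) then fixes u_∥. The following Python sketch is my own illustration of that reasoning; the function names are hypothetical, and the nonrelativistic, slow-variation assumptions of the text are implied.

```python
import math


def mirror_field(b0, u_par0, u_perp0):
    """Field magnitude at which the longitudinal velocity vanishes (reflection point)."""
    return b0 * (u_par0**2 + u_perp0**2) / u_perp0**2


def u_parallel(b, b0, u_par0, u_perp0):
    """Longitudinal speed at field b, or None if the particle cannot reach that field."""
    u_perp_sq = u_perp0**2 * b / b0                   # constant flux: u_perp^2 scales as B
    u_par_sq = u_par0**2 + u_perp0**2 - u_perp_sq     # constant kinetic energy, Eq. (9.205)
    if u_par_sq < 0.0:
        return None
    return math.sqrt(u_par_sq)


# Illustrative values: u_par = 1e5 m/s and u_perp = 1e6 m/s at the field b0 = 1 T.
print(mirror_field(1.0, 1.0e5, 1.0e6))
print(u_parallel(1.005, 1.0, 1.0e5, 1.0e6))
```

With these sample numbers, a mere 1% increase of the field is sufficient to stop the longitudinal motion, which is why particles with small u_∥/u_⊥ are trapped so readily.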


Returning to the constancy of magnetic flux encircled by free particles, it reminds us of the 
Meissner-Ochsenfeld effect discussed in Sec. 6.3, and gives a motivation for a brief revisit of the 
electrodynamics of superconductivity. As was emphasized in that section, superconductivity is a 


66 For further reading on this technology, the reader may be referred, for example, to a simple monograph by
F. F. Chen, Introduction to Plasma Physics and Controlled Fusion, vol. 1, 2nd ed., Springer, 1984, and/or a
graduate-level theoretical treatment by R. D. Hazeltine and J. D. Meiss, Plasma Confinement, Dover, 2003.

substantially quantum phenomenon; nevertheless the notion of the conjugate momentum P helps to
understand its description. Indeed, the general rule of quantization of physical systems 67 is that each
canonical pair {q_j, p_j} of a generalized coordinate and the corresponding momentum is described by
quantum-mechanical operators that obey the following commutation relation:

$$ [\hat{q}_j, \hat{p}_{j'}] = i\hbar \delta_{jj'}. \tag{9.206} $$

According to Eq. (191), for Cartesian coordinates of a particle in an electromagnetic field, the
corresponding generalized momenta are P_j, so that their operators should obey the following
commutation relations:

$$ [\hat{r}_j, \hat{P}_{j'}] = i\hbar \delta_{jj'}. \tag{9.207} $$

In the coordinate representation of quantum mechanics, canonical momentum operators are
described by Cartesian components of the vector operator -iħ∇. As a result, ignoring the rest energy mc²
(which gives an inconsequential phase factor exp{-imc²t/ħ} in the wave function), we can use Eq. (196)
to rewrite the nonrelativistic Schrödinger equation,

$$ i\hbar \frac{\partial \psi}{\partial t} = \hat{\mathscr{H}} \psi, \tag{9.208} $$

as follows:

$$ i\hbar \frac{\partial \psi}{\partial t} = \left( \frac{\hat{p}^2}{2m} + U \right) \psi \equiv \left[ \frac{1}{2m} (-i\hbar\nabla - q\mathbf{A})^2 + q\phi \right] \psi. \tag{9.209} $$

Thus, I believe I have finally delivered on my promise to justify the replacement (6.44) which
had been used in Chapter 6 to discuss electrodynamics of superconductors, including the
Meissner-Ochsenfeld effect. 68

9.8. Analytical mechanics of electromagnetic field 

We have just seen that the analytical mechanics of a particle in an electromagnetic field may be used
to get some important results. The same is true for the analytical mechanics of the field alone, and the
field-particle system as a whole, which will be discussed in this section. For such a space-distributed
system as the field, governed by local dynamics laws (Maxwell equations), we need to apply analytical
mechanics to the local densities ℓ and h of the Lagrangian and Hamiltonian functions, defined by
relations

$$ \mathscr{L} = \int \ell \, d^3r, \qquad \mathscr{H} = \int h \, d^3r. \tag{9.210} $$

Let us start, as usual, from the Lagrange formalism. Some clue on the possible structure of the
Lagrangian density ℓ may be obtained from that of the description of the particle-field interaction in this

67 See, e.g., CM Sec. 10.1. 

68 Equation (209) is also the basis for discussion of numerous other magnetic field phenomena, including the
Aharonov-Bohm and quantum Hall effects - see, e.g., QM Secs. 3.1-3.2.

formalism, which was discussed in the last section. For the case of a single particle, the interaction is
described by the last two terms of Eq. (183):

$$\mathcal{L}_{int} = -q\phi + q\mathbf{u}\cdot\mathbf{A}. \tag{9.211}$$

It is obvious that if charge q is continuously distributed over some volume, we may represent
$\mathcal{L}_{int}$ as a volume integral of the Lagrangian density

[Interaction Lagrangian density]
$$\ell_{int} = -\rho\phi + \mathbf{j}\cdot\mathbf{A} \equiv -j_\alpha A^\alpha. \tag{9.212}$$

Notice that the density (in contrast to $\mathcal{L}_{int}$ itself) is Lorentz-invariant. (This is due to the
contraction of the longitudinal coordinate, and hence volume, at the Lorentz transform.) Hence we may
expect the density of the field’s Lagrangian to be Lorentz-invariant as well. Moreover, in view of the
simple, local structure of the Maxwell equations (containing only first spatial and temporal derivatives
of the fields), $\ell$ should be a simple function of the potential’s 4-vector and its 4-derivative:

$$\ell = \ell(A^\alpha, \partial_\alpha A^\beta). \tag{9.213}$$

Also, the density should be selected in such a way that the 4-vector analog of the Lagrangian equations
of motion,

$$\partial_\alpha\frac{\partial\ell}{\partial(\partial_\alpha A^\beta)} - \frac{\partial\ell}{\partial A^\beta} = 0, \tag{9.214}$$


gives the correct inhomogeneous Maxwell equations (127). 69, 70 It is clear that the field part $\ell_{field}$ of the
total Lagrangian density $\ell$ should be a scalar, and a quadratic form of the field strength, i.e. of $F^{\alpha\beta}$, so
that the natural choice is

$$\ell_{field} = \text{const}\times F_{\alpha\beta}F^{\alpha\beta}, \tag{9.215}$$

with implied summation over both indices. Indeed, adding to this expression the interaction Lagrangian
density (212),

$$\ell = \ell_{field} + \ell_{int} = \text{const}\times F_{\alpha\beta}F^{\alpha\beta} - j_\alpha A^\alpha, \tag{9.216}$$


and performing differentiation, we may check that Eq. (214) indeed yields Eqs. (127), provided that the
constant factor equals $(-1/4\mu_0)$. 71 With that, the field Lagrangian density becomes

[Field’s Lagrangian density]
$$\ell_{field} = -\frac{1}{4\mu_0}F_{\alpha\beta}F^{\alpha\beta} = \frac{1}{2\mu_0}\left(\frac{E^2}{c^2} - B^2\right) = \frac{\varepsilon_0}{2}E^2 - \frac{B^2}{2\mu_0} = u_e - u_m, \tag{9.217}$$

where $u_e$ is the electric field energy density (1.67), and $u_m$ is the magnetic field
energy density (5.57).
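The chain of equalities in Eq. (217) can be cross-checked numerically. The following is a minimal Python sketch, not part of the original derivation: it builds a contravariant field-strength tensor, lowers its indices with the metric diag(1, -1, -1, -1), and compares $-F_{\alpha\beta}F^{\alpha\beta}/4\mu_0$ with $u_e - u_m$. The explicit component layout in `field_tensor_upper` is an assumption (one common SI convention); the convention of Eq. (125) may differ from it by an overall sign, which does not affect the quadratic invariant. All function and variable names are illustrative.

```python
import math

MU0 = 4e-7 * math.pi            # vacuum permeability, H/m (classical value, adequate here)
C = 299_792_458.0               # speed of light, m/s
EPS0 = 1.0 / (MU0 * C**2)       # vacuum permittivity, from eps0 * mu0 * c^2 = 1

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric


def field_tensor_upper(E, B):
    """Contravariant F^{alpha beta} in an assumed SI sign convention."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def lagrangian_density(E, B):
    """-(1/4 mu0) F_{alpha beta} F^{alpha beta}, summed over both indices."""
    f_up = field_tensor_upper(E, B)
    invariant = 0.0
    for a in range(4):
        for b in range(4):
            f_low = METRIC[a] * f_up[a][b] * METRIC[b]  # lower both indices
            invariant += f_low * f_up[a][b]
    return -invariant / (4.0 * MU0)


def energy_densities(E, B):
    """Electric and magnetic field energy densities u_e and u_m."""
    u_e = 0.5 * EPS0 * sum(x * x for x in E)
    u_m = 0.5 * sum(x * x for x in B) / MU0
    return u_e, u_m


# Arbitrary test fields (V/m and T), chosen only for illustration.
E_test = (1.0e5, -2.0e5, 3.0e4)
B_test = (1.0e-3, 2.0e-4, -5.0e-4)

ell_field = lagrangian_density(E_test, B_test)
u_e, u_m = energy_densities(E_test, B_test)
print(ell_field, u_e - u_m)
```

The two printed numbers should agree to rounding error for any choice of the test fields, which is just the statement that the invariant $F_{\alpha\beta}F^{\alpha\beta}$ equals $2(B^2 - E^2/c^2)$.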


69 As a reminder, the homogeneous Maxwell equations (129) are satisfied by the very structure (125) of the field 
strength tensor. 

70 Here the implicit summation over index $\alpha$ plays a role similar to the convective derivative (188) in replacing
the full derivative over time, in a way that reflects the symmetry of time and space in special relativity. I do not
want to spend more time justifying Eq. (214), for reasons that will be clear very soon.

71 In the Gaussian units, the coefficient is $(-1/16\pi)$.
Let me hope the reader agrees that Eq. (217) is a wonderful result, because the Lagrangian
function has a structure absolutely similar to the well-known expression $\mathcal{L} = T - U$ of classical
mechanics. So, for the field alone, the “potential” and “kinetic” energies are separable again. 72

As a sanity check, let us explore whether we can calculate a 4-vector analog of the Hamiltonian
function $\mathcal{H}$. In generic analytical mechanics,

$$\mathcal{H} = \sum_j\frac{\partial\mathcal{L}}{\partial\dot{q}_j}\dot{q}_j - \mathcal{L}. \tag{9.218}$$

However, just as for the Lagrangian function, for a field we should find the spatial density $h$ of the
Hamiltonian, defined by the second of Eqs. (210), for which a natural 4-form of Eq. (218) is

$$h^{\alpha\beta} = \frac{\partial\ell}{\partial(\partial_\alpha A^\gamma)}\,\partial^\beta A^\gamma - g^{\alpha\beta}\ell. \tag{9.219}$$

Calculated for the field alone, i.e. using Eq. (217) for $\ell$, this definition yields

$$h^{\alpha\beta}_{field} = \theta^{\alpha\beta} - \tau_D^{\alpha\beta}, \tag{9.220}$$

where the tensor

[Symmetric energy-momentum tensor]
$$\theta^{\alpha\beta} \equiv \frac{1}{\mu_0}\left(g^{\alpha\gamma}F_{\gamma\delta}F^{\delta\beta} + \frac{1}{4}g^{\alpha\beta}F_{\gamma\delta}F^{\gamma\delta}\right) \tag{9.221}$$

is gauge-invariant, while the remaining term,

$$\tau_D^{\alpha\beta} \equiv \frac{1}{\mu_0}g^{\alpha\gamma}F_{\gamma\delta}\,\partial^\delta A^\beta, \tag{9.222}$$

is not, so that it cannot correspond to any measurable variables. Fortunately, it is straightforward to
verify that tensor $\tau_D^{\alpha\beta}$ may be presented in the form

$$\tau_D^{\alpha\beta} = \frac{1}{\mu_0}\partial_\gamma\left(F^{\gamma\alpha}A^\beta\right), \tag{9.223}$$

and as a result obeys the following relations:

$$\partial_\alpha\tau_D^{\alpha\beta} = 0, \qquad \int\tau_D^{0\beta}\,d^3r = 0, \tag{9.224}$$

so it does not interfere with the conservation properties of the gauge-invariant, symmetric energy-
momentum tensor (also called the symmetric stress tensor) to be discussed below.

Using Eqs. (125), components of the latter tensor may be expressed via the electric and magnetic
fields. For $\alpha = \beta = 0$,

$$\theta^{00} = \frac{\varepsilon_0}{2}E^2 + \frac{B^2}{2\mu_0} = u_e + u_m \equiv u, \tag{9.225}$$


72 Since the Lagrange equations of motion are homogeneous, the simultaneous change of sign of T and U does not 
change them. Thus, it is not important which of two energy densities, u e or u m , we count as the potential energy. 
i.e. the expression for the total energy density u - see Eq. (6.104b). The other 3 components of the same
row/column turn out to be just the Cartesian components of the Poynting vector, divided by c:

$$\theta^{j0} = \frac{1}{\mu_0}\left(\frac{\mathbf{E}}{c}\times\mathbf{B}\right)_j = \left(\frac{\mathbf{E}}{c}\times\mathbf{H}\right)_j = \frac{S_j}{c}, \quad \text{for } j = 1, 2, 3. \tag{9.226}$$


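The remaining 9 components $\theta^{jj'}$ of the tensor, with $j, j' = 1, 2, 3$, are usually presented as

$$\theta^{jj'} = -\tau^{(M)}_{jj'}, \tag{9.227}$$

where $\tau^{(M)}_{jj'}$ is the so-called Maxwell stress tensor,

$$\tau^{(M)}_{jj'} = \varepsilon_0\left(E_jE_{j'} - \frac{\delta_{jj'}}{2}E^2\right) + \frac{1}{\mu_0}\left(B_jB_{j'} - \frac{\delta_{jj'}}{2}B^2\right), \tag{9.228}$$

so that the whole symmetric energy-momentum tensor may be conveniently presented in the following
symbolic way:

$$\theta^{\alpha\beta} = \begin{pmatrix} u & \mathbf{S}/c \\ \mathbf{S}/c & -\tau^{(M)}_{jj'} \end{pmatrix}. \tag{9.229}$$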


The physical meaning of this tensor may be revealed in the following way. Considering Eq.
(221) just as the definition of tensor $\theta^{\alpha\beta}$, 73 and using the 4-vector form of Maxwell equations, given by
Eqs. (127) and (129), it is straightforward to verify an extremely simple result for the 4-derivative of the
symmetric tensor:

$$\partial_\alpha\theta^{\alpha\beta} = -F^{\beta\gamma}j_\gamma. \tag{9.230}$$

This expression is valid in the presence of the electromagnetic field sources, e.g., for any system of
charged particles and the field they have created. Of these 4 equations (for 4 values of index $\beta$), the
temporal one (with $\beta = 0$) may be simply expressed via the energy density (225) and Poynting vector
(226):

$$\frac{\partial u}{\partial t} + \nabla\cdot\mathbf{S} = -\mathbf{j}\cdot\mathbf{E}, \tag{9.231}$$

while the 3 spatial equations (with $\beta = j = 1, 2, 3$) may be presented in the form

$$\frac{\partial}{\partial t}\frac{S_j}{c^2} - \sum_{j'=1}^{3}\frac{\partial}{\partial r_{j'}}\tau^{(M)}_{jj'} = -\left(\rho\mathbf{E} + \mathbf{j}\times\mathbf{B}\right)_j. \tag{9.232}$$

Integrated over a volume V limited by surface S, with the account of the divergence theorem, Eq.
(231) returns us to the Poynting theorem (6.103):


73 In this way, we are using Eqs. (214) and (221) just as useful guesses, leading to the definition of $\theta^{\alpha\beta}$, and may
leave their strict justification for more serious field theory courses.

$$\int_V\left(\frac{\partial u}{\partial t} + \mathbf{j}\cdot\mathbf{E}\right)d^3r + \oint_S S_n\,d^2r = 0, \tag{9.233}$$

while Eq. (232) yields: 74

$$\int_V\left[\frac{\partial}{\partial t}\frac{\mathbf{S}}{c^2} + \mathbf{f}\right]_j d^3r = \sum_{j'=1}^{3}\oint_S\tau^{(M)}_{jj'}\,dA_{j'}, \quad \text{with } \mathbf{f} \equiv \rho\mathbf{E} + \mathbf{j}\times\mathbf{B}, \tag{9.234}$$

where $dA_j = n_j dA = n_j d^2r$ is the $j$th component of the elementary area vector $d\mathbf{A} = \mathbf{n}\,dA = \mathbf{n}\,d^2r$ that is
normal to volume’s surface, and directed out of the volume - see Fig. 17.

Fig. 9.17. Force dF exerted on a boundary element dA of volume V occupied by the field.


Since, according to Eq. (5.10), vector f is nothing else than the density of volume-distributed
forces applied from the field to the particles, we can use the 2nd Newton law, in its relativistic form
(144), to rewrite Eq. (234), for a stationary volume V, as

[Total momentum’s dynamics]
$$\frac{d}{dt}\left[\int_V\frac{\mathbf{S}}{c^2}\,d^3r + \mathbf{p}_{part}\right] = \mathbf{F}, \tag{9.235}$$

where $\mathbf{p}_{part}$ is the total mechanical (relativistic) momentum of all particles in volume V, and vector F is
defined by its Cartesian components:

[Force via the Maxwell tensor]
$$F_j = \sum_{j'=1}^{3}\oint_S\tau^{(M)}_{jj'}\,dA_{j'}. \tag{9.236}$$

Equations (235)-(236) are our main new results. The first of them shows that vector

[Electromagnetic field’s momentum]
$$\mathbf{g} \equiv \frac{\mathbf{S}}{c^2} \tag{9.237}$$


may be interpreted as the density of momentum of the electromagnetic field (per unit volume). This
classical relation is consistent with the quantum-mechanical picture of photons being considered as
ultrarelativistic particles, with momentum magnitude $\mathcal{E}/c$ (where $\mathcal{E}$ is the photon’s energy), because
then the total flux of the momentum carried by photons through a unit normal area per unit time may be
presented as either $S_n/c$ or as $g_nc$. It
also allows us to revisit the Poynting vector paradox that was discussed in Sec. 6.7 - see Fig. 6.9 and its


74 Just like the Poynting theorem (233), Eq. (234) may be obtained directly from the Maxwell equations, without
resorting to the 4-vector formalism - see, e.g., Sec. 8.2.2 in D. J. Griffiths, Introduction to Electrodynamics, 3rd
ed., Prentice-Hall, 1999. However, the derivation discussed above is preferable, because it shows the wonderful
unity between the laws of conservation of energy and momentum.

discussion. As was emphasized in that discussion, vector $\mathbf{S} = \mathbf{E}\times\mathbf{H}$ in this case does not correspond
to any measurable energy flow. However, the corresponding momentum (237) of the field is not only
real, but may be measured by the recoil impulse 75 it gives to the field sources (say, to a magnetic coil
inducing field H and to the capacitor plates creating field E).

Now let us turn to our second result, Eq. (236). It tells us that the 3x3-element Maxwell stress 
tensor complies with the general definition of the stress tensor 76 characterizing force F exerted by 
external forces on the boundary of a volume, in this case occupied by the electromagnetic field (Fig. 
17). 77 Let us use this important result to analyze two simple examples for static fields. 

(i) Electrostatic field’s effect on a perfect conductor. Since Eq. (235) has been derived for a free 
space region, we have to select volume V outside the conductor, but we may align one of its faces with 
conductor’s surface (Fig. 18). 


Fig. 9.18. Electrostatic field near conductor’s surface.


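From Chapter 2, we know that the electrostatic field has to be perpendicular to conductor’s surface.
Selecting axis z in this direction, we have $E_x = E_y = 0$, $E_z = \pm E$, so that only diagonal components of
tensor (228) do not equal zero:

$$\tau^{(M)}_{xx} = \tau^{(M)}_{yy} = -\frac{\varepsilon_0}{2}E^2, \qquad \tau^{(M)}_{zz} = \frac{\varepsilon_0}{2}E^2. \tag{9.239}$$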


Since the elementary surface area vector has just one nonvanishing component, $dA_z$, according to Eq.
(236), only the last component (that is positive regardless of the sign of E) gives a contribution to the
surface force F. We see that the force exerted by the conductor (and eventually by external forces that
hold the conductor in its equilibrium position) on the field is normal to the conductor and directed out of
the field volume: $dF_z > 0$. Hence, by the 3rd Newton law, the force exerted by the field on conductor’s
surface is directed toward the field-filled space:

$$dF_{surface} = -dF_z = -\frac{\varepsilon_0}{2}E^2\,dA. \tag{9.240}$$


This important result could be obtained by simpler means as well. For example, one could argue, 
quite convincingly, that the local relation between the force and field should not depend on the global 


75 This impulse is sometimes called the hidden momentum ; this term makes sense if the field sources have finite 
masses, so that their velocity change at the field variation is measurable. 

76 See, e.g., CM Sec. 7.2. 

77 Note that the field-to-particle interaction gives a vanishing contribution into the net integral, as it should for any 
internal interaction between internal parts of a system. 
configuration creating the field, and consider a planar capacitor (Fig. 2.2) with surfaces of both plates
charged by equal and opposite charges of density $\sigma = \pm\varepsilon_0E$. According to the Coulomb law, the charges
should attract each other, pulling each plate toward the field region, so that the Maxwell-tensor result gives
the correct direction of the force. The force’s magnitude (240) can be verified either by the direct
integration of the Coulomb law, or by the following simple reasoning. In the plane capacitor, field $E_z =
\sigma/\varepsilon_0$ is equally contributed by two surface charges; hence the field created by the negative charge of the
counterpart plate (not shown in Fig. 18) is $E_- = \sigma/2\varepsilon_0$, and the force it exerts on the elementary surface
charge $dQ = \sigma\,dA$ of the positively charged plate is $dF = E_-dQ = \sigma^2dA/2\varepsilon_0 = \varepsilon_0E^2dA/2$, in accordance
with Eq. (240). 78

Quantitatively, even for such a high electric field as E = 3 MV/m (close to the electric breakdown
in air), the “negative pressure” (dF/dA) given by Eq. (240) is of the order of 40 Pa (N/m²), i.e. below
one thousandth of the ambient atmospheric pressure (1 bar ≈ $10^5$ Pa). Still, these forces may be
substantial in some cases, especially in good dielectrics (such as high-quality SiO2, grown at high
temperature, which is broadly used in integrated circuits) that can withstand fields up to ~$10^9$ V/m.
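The magnitude of the electrostatic “negative pressure” follows directly from Eq. (240), $|dF/dA| = \varepsilon_0E^2/2$. A minimal Python sketch of this arithmetic is given below; the function name is illustrative, and the two field values are simply the ones quoted in the text.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
BAR = 1.0e5               # 1 bar in pascals


def electrostatic_pressure(E):
    """Magnitude of the pull per unit area, eps0 * E^2 / 2, in Pa (E in V/m)."""
    return 0.5 * EPS0 * E**2


p_air = electrostatic_pressure(3.0e6)    # field close to breakdown in air
p_sio2 = electrostatic_pressure(1.0e9)   # field sustainable by a good dielectric

print(f"E = 3 MV/m: {p_air:.1f} Pa = {p_air / BAR:.1e} bar")
print(f"E = 1 GV/m: {p_sio2:.2e} Pa = {p_sio2 / BAR:.0f} bar")
```

For 3 MV/m this gives about 40 Pa, well below one thousandth of the atmospheric pressure, while for $10^9$ V/m the result is a few megapascals, i.e. tens of bars, which is why such forces cannot always be ignored.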

(ii) Static magnetic field’s effect on its source 79 - say, solenoid’s wall or superconductor’s 
surface (Fig. 19). 


Fig. 9.19. Static magnetic field near a current-carrying surface.

With the choice of coordinates shown in Fig. 19, we have $B_x = \pm B$, $B_y = B_z = 0$, so that the
Maxwell stress tensor (228) is diagonal again:

$$\tau^{(M)}_{xx} = \frac{1}{2\mu_0}B^2, \qquad \tau^{(M)}_{yy} = \tau^{(M)}_{zz} = -\frac{1}{2\mu_0}B^2. \tag{9.241}$$
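The diagonal components quoted in Eqs. (239) and (241) can be checked with a few lines of Python. The sketch below assumes the standard SI form of the Maxwell stress tensor, $\tau^{(M)}_{jj'} = \varepsilon_0(E_jE_{j'} - \delta_{jj'}E^2/2) + (B_jB_{j'} - \delta_{jj'}B^2/2)/\mu_0$, for Eq. (228); the function name and the sample field values are illustrative only.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
MU0 = 4e-7 * math.pi        # vacuum permeability, H/m


def maxwell_stress(E, B):
    """3x3 Maxwell stress tensor (assumed standard SI form) as a nested list."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in B)
    return [
        [
            EPS0 * (E[j] * E[k] - 0.5 * (j == k) * e2)
            + (B[j] * B[k] - 0.5 * (j == k) * b2) / MU0
            for k in range(3)
        ]
        for j in range(3)
    ]


# Case (i): electrostatic field normal to the surface, E along -z.
E0 = 3.0e6
tau_e = maxwell_stress((0.0, 0.0, -E0), (0.0, 0.0, 0.0))

# Case (ii): magnetic field tangential to the surface, B along +x.
B0 = 10.0
tau_b = maxwell_stress((0.0, 0.0, 0.0), (B0, 0.0, 0.0))

print("electric: xx, yy, zz =", tau_e[0][0], tau_e[1][1], tau_e[2][2])
print("magnetic: xx, yy, zz =", tau_b[0][0], tau_b[1][1], tau_b[2][2])
```

In the electric case the zz component is positive whatever the sign of $E_z$, while in the magnetic case it is negative; this sign difference is what distinguishes the electrostatic pull from the magnetic push.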


However, since for this geometry only $dA_z$ differs from 0 in Eq. (236), the sign of the resulting force
is opposite to that in electrostatics: $dF_z < 0$, and the force exerted by the magnetic field upon the
conductor’s surface,

[Magnetic field’s push]
$$dF_{surface} = -dF_z = \frac{1}{2\mu_0}B^2\,dA, \tag{9.242}$$


78 By the way, repeating these arguments for a plane capacitor filled with a linear dielectric, we may
readily see that Eq. (240) may be generalized for this case by replacing $\varepsilon_0$ with $\varepsilon$. The similar replacement
($\mu_0 \rightarrow \mu$) is valid for Eq. (242) in a linear magnetic medium.

79 The causal relation is not important here. Especially in the case of a superconductor, the magnetic field may be 
induced by another source, with the surface supercurrent j just shielding the superconductor’s bulk from its 
penetration - see Sec. 6. 
corresponds to a positive pressure. For good laboratory magnets (B ~ 10 T), this pressure is of the order
of $4\times10^7$ Pa ≈ 400 bar, i.e. is very substantial, so the magnets require solid mechanical design.
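The quoted estimate is just Eq. (242) evaluated for B = 10 T. A minimal Python sketch of the arithmetic (the function name is illustrative):

```python
import math

MU0 = 4e-7 * math.pi   # vacuum permeability, H/m
BAR = 1.0e5            # 1 bar in pascals


def magnetic_pressure(B):
    """Magnetic pressure B^2 / (2 mu0) in Pa, for B in tesla."""
    return B**2 / (2.0 * MU0)


p_magnet = magnetic_pressure(10.0)
print(f"B = 10 T: {p_magnet:.2e} Pa = {p_magnet / BAR:.0f} bar")
```

Because the pressure scales as $B^2$, doubling the field quadruples the mechanical load on the magnet structure.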

The direction of force (242) could also be readily predicted from elementary magnetostatics
arguments. Indeed, we can imagine the magnetic field volume limited by another, parallel wall with the
opposite direction of surface current. According to the starting point of magnetostatics, Eq. (5.1), such
surface currents of opposite directions have to repel each other - doing that via the magnetic field.

Another explanation of the fundamental sign difference between the electric and magnetic field
pressures may be provided in the language of electric circuits. As we know from Chapter 2, the potential
energy of the electric field stored in a capacitor may be presented in two equivalent forms,

$$U_e = \frac{CV^2}{2} = \frac{Q^2}{2C}. \tag{9.243}$$

Similarly, the magnetic field energy of an inductive coil is

$$U_m = \frac{LI^2}{2} = \frac{\Phi^2}{2L}. \tag{9.244}$$


If we do not want to consider the work of external sources on a virtual change of the system dimensions,
we should use the latter forms of these relations, i.e. consider a galvanically detached capacitor (Q =
const) and an externally-shorted inductance ($\Phi$ = const). 80 Now if we let the electric field forces (240)
drag capacitor’s plates in the direction they “want”, i.e. toward each other, this would lead to a reduction
of the capacitor thickness, and hence to an increase of capacitance C, and hence to a decrease of $U_e$.
Similarly, for a solenoid, allowing pressure (242) to move its walls would lead to an increase of the
solenoid volume, and hence of its inductance L, so that the potential energy $U_m$ would also be reduced -
as it should be. It is remarkable (actually, beautiful) how the local field formulas (240) and (242)
“know” about these global circumstances.
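This energy argument can be turned into a small numerical experiment. The Python sketch below is an illustration under stated assumptions, not part of the original text: it models an ideal plane capacitor at fixed charge ($C = \varepsilon_0A/d$) and a long ideal solenoid at fixed flux linkage ($L = \mu_0N^2\pi r^2/l$), differentiates $U_e = Q^2/2C$ and $U_m = \Phi^2/2L$ numerically with respect to the gap and the radius, and compares the resulting force per unit area with Eqs. (240) and (242). All geometric parameters and helper names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
MU0 = 4e-7 * math.pi        # vacuum permeability, H/m


def derivative(func, x, rel_step=1e-6):
    """Central-difference estimate of d(func)/dx at point x."""
    h = rel_step * x
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Plane capacitor with fixed charge Q (galvanically detached).
E_field = 3.0e6                 # V/m, field between the plates
plate_area = 1.0e-2             # m^2 (assumed)
gap = 1.0e-3                    # m (assumed)
charge = EPS0 * E_field * plate_area


def capacitor_energy(d):
    """U_e = Q^2 / (2C) for an ideal plane capacitor with gap d."""
    capacitance = EPS0 * plate_area / d
    return charge**2 / (2.0 * capacitance)


# Generalized force along the gap coordinate; negative means attraction.
pressure_e = -derivative(capacitor_energy, gap) / plate_area

# Long solenoid with fixed flux linkage (externally shorted).
B_field = 10.0                  # T
radius = 0.05                   # m (assumed)
length = 0.5                    # m (assumed)
turns = 1000                    # (assumed)
flux_linkage = turns * B_field * math.pi * radius**2


def solenoid_energy(r):
    """U_m = Phi^2 / (2L) for an ideal long solenoid of radius r."""
    inductance = MU0 * turns**2 * math.pi * r**2 / length
    return flux_linkage**2 / (2.0 * inductance)


# Generalized radial force divided by the wall area; positive means outward push.
pressure_m = -derivative(solenoid_energy, radius) / (2.0 * math.pi * radius * length)

print(pressure_e, -0.5 * EPS0 * E_field**2)
print(pressure_m, B_field**2 / (2.0 * MU0))
```

The first pair of numbers is negative (the plates are pulled together) and the second pair is positive (the walls are pushed apart), with magnitudes matching the local formulas.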


Finally, let us see whether the major results (237) and (242), obtained in this section, match each
other. For that, let us return to the normal incidence of a plane, monochromatic wave from free space on
the plane surface of a perfect conductor (see Fig. 7.8 and its discussion), and use those results to
calculate the time average of pressure $dF_{surface}/dA$ imposed by the wave on the surface. At elastic
reflection from conductor’s surface, electromagnetic field’s momentum retains its amplitude but
changes its sign, so that the momentum transferred to a unit area of the surface in unit time (i.e. the
average pressure) is

$$\frac{\overline{dF_{surface}}}{dA} = 2c\,\overline{g}_{incident} = 2c\,\frac{\overline{S}_{incident}}{c^2} = \frac{2c}{c^2}\,\frac{1}{2}E_\omega H_\omega^* = \frac{E_\omega H_\omega^*}{c}, \tag{9.245}$$


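where $E_\omega$ and $H_\omega$ are complex amplitudes of the incident wave. Using relation (7.7) between these
amplitudes (for $\varepsilon = \varepsilon_0$ and $\mu = \mu_0$ giving $E_\omega = cB_\omega$), we get

$$\frac{\overline{dF_{surface}}}{dA} = \frac{1}{c}\,cB_\omega\frac{B_\omega^*}{\mu_0} = \frac{B_\omega B_\omega^*}{\mu_0}. \tag{9.246}$$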


80 Of course, this condition may hold “forever” only for solenoids with superconducting wiring, but even in
normal-metal solenoids with practicable inductances, the flux relaxation constants L/R may be rather large
(practically, up to a few minutes), quite sufficient to carry out force measurements.
On the other hand, as was discussed in Sec. 7.4, at the surface of the perfect mirror the electric
field vanishes while the magnetic field doubles, so that we can use Eq. (242) with $B \rightarrow B(t) =
2\,\mathrm{Re}[B_\omega e^{-i\omega t}]$. Averaging the pressure over time, we get

$$\frac{\overline{dF_{surface}}}{dA} = \frac{1}{2\mu_0}\overline{\left(2\,\mathrm{Re}\left[B_\omega e^{-i\omega t}\right]\right)^2} = \frac{B_\omega B_\omega^*}{\mu_0}, \tag{9.247}$$

i.e. the same result as Eq. (246).

For the development of physics intuition, it is useful to estimate the electromagnetic radiation
pressure’s magnitude. Even for the relatively high wave intensity $S_n$ of 1 kW/m² (close to that of the
direct sunlight at Earth’s orbit), pressure $2cg_n = 2S_n/c$ is somewhat below $10^{-5}$ Pa ~ $10^{-10}$ bar. Still, this
extremely small effect was experimentally observed (by P. Lebedev) as early as in 1899, giving one of
the most important confirmations of Maxwell’s theory.
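Both routes to the radiation pressure on a perfect mirror can be compared numerically. The following Python sketch is an illustration with assumed names: it evaluates the momentum-flux result $2S_n/c$ for an intensity of 1 kW/m², and separately averages the magnetic pressure $B^2(t)/2\mu_0$ of the doubled surface field over one period, using the free-space relation $\overline{S} = c|B_\omega|^2/2\mu_0$ to find the incident amplitude.

```python
import math

C = 299_792_458.0          # speed of light, m/s
MU0 = 4e-7 * math.pi       # vacuum permeability, H/m

intensity = 1.0e3          # W/m^2, time-averaged Poynting vector magnitude

# Route 1: momentum reversal at elastic reflection, pressure = 2 c g = 2 S / c.
pressure_momentum = 2.0 * intensity / C

# Route 2: time-averaged magnetic pressure of the doubled field at the mirror.
b_amplitude = math.sqrt(2.0 * MU0 * intensity / C)   # |B_omega| of the incident wave
samples = 10_000                                     # sampling points over one period
mean_b_squared = sum(
    (2.0 * b_amplitude * math.cos(2.0 * math.pi * k / samples)) ** 2
    for k in range(samples)
) / samples
pressure_field = mean_b_squared / (2.0 * MU0)

print(f"2S/c             = {pressure_momentum:.3e} Pa")
print(f"<B(t)^2>/(2 mu0) = {pressure_field:.3e} Pa")
```

Both numbers come out slightly below $10^{-5}$ Pa, in line with the estimate above.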


9.9. Exercise problems 

9.1. Use the nonrelativistic Doppler effect picture to derive Eq. (4).

9.2. Show that two successive Lorentz space/time transforms in the same direction, with
velocities u’ and v, are equivalent to a single transform with velocity u given by Eq. (25).

9.3. N + 1 reference frames, numbered by index n (taking values 0, 1, ..., N), move in the same
direction as a particle. Express the particle’s velocity in frame n = 0 via its velocity $u_N$ in frame number
N and the set of velocities $v_n$ of frame number n relative to the frame number (n - 1).

9.4. A spaceship, moving with constant velocity v directly from the Earth, sends back brief
flashes of light with period $\Delta t_s$ - as measured by spaceship's clock. Calculate the period with which
Earth's observers receive the signals - as measured by Earth's clock.

9.5. From the point of view of reference frame O', a straight rod, parallel to axis x', is moving,
without rotation, with constant velocity u' directed along axis y'. The reference frame O' is itself moving
relative to another ("lab") reference frame O, with similarly oriented axes, with a constant velocity v
along axis x, also without rotation - see Fig. on the right. Calculate:

(i) the direction of rod’s velocity, and 

(ii) the orientation of the rod on the [x, y] plane, 

as observed from the lab reference frame. Is the velocity perpendicular to the rod? 

9.6. A relativistic particle moving with velocity u decays into two particles with zero rest mass.

(i) Calculate the smallest possible angle between the decay product velocities (in the lab frame).

(ii) What is the largest possible energy of one product particle?

9.7. Starting from rest at t = 0, a spaceship moves with a constant acceleration, as measured
in its instantaneous rest frame. Find its displacement x(t) from the starting point, as measured from the
lab frame, and interpret the result.

9.8. Calculate the first relativistic correction to the frequency of a harmonic oscillator as a
function of its amplitude.

9.9. A particle with rest mass m decays into two particles, with rest masses $m_1$ and $m_2$. Calculate
the total energy of the first decay product, in the rest frame of the decayed particle.


9.10. A relativistic particle, propagating with velocity v outside of external fields, decays into
two photons. 81 Calculate the angular dependence of the probability of photon detection.

9.11. A photon with wavelength $\lambda$ is scattered by an electron, initially at rest. Considering the
photon as an ultrarelativistic particle (with the rest mass m = 0), find wavelength $\lambda'$ of the scattered
photon as a function of the scattering angle $\alpha$ - see Fig. on the right. 82

9.12. Calculate the threshold energy of a $\gamma$-photon for the reaction

$\gamma + p \rightarrow p + \pi^0$,

if the proton was initially at rest.

Hint: For protons $m_pc^2 \approx 938$ MeV, while for neutral pions $m_\pi c^2 \approx 135$ MeV.

9.13. A relativistic particle with energy $\mathcal{E}$ and rest mass m collides with a similar particle,
initially at rest in the laboratory frame. Find:

(i) the final velocity of the center of mass of the system, in the lab frame, 

(ii) the total energy of the system, in the center-of-mass frame, and 

(iii) the final velocities of both particles (in the lab frame), if they move along the same 
direction. 


9.14. A “primed” reference frame moves with the reduced velocity $\boldsymbol{\beta} \equiv \mathbf{v}/c = \mathbf{n}_x\beta$ relative to the
“lab” frame. Use Eq. (109) to spell out components $T'^{00}$ and $T'^{0j}$ (with j = 1, 2, 3) of an arbitrary
contravariant 4-tensor $T^{\gamma\delta}$.

9.15. Static fields E and B are uniform but arbitrary (both in magnitude and in direction). What
should be the velocity of an inertial reference frame to have the vectors E’ and B’, observed from that
frame, parallel? Is this solution unique?

9.16. Two charged particles, moving with the same constant velocity u, are
offset by distance R = {a, b} (see Fig. on the right), as measured in the lab frame.
Calculate the forces between the particles - also in the lab frame.


81 Such a decay may happen, for example, with a neutral pion. 

82 This is the famous Compton scattering problem.

9.17. Each of two very thin, long, parallel beams of electrons of the same velocity u carries
electric charge of density $\lambda$ per unit length (as measured in the coordinate frame moving with electrons).

(i) Calculate the distribution of the electric and magnetic fields in the system (outside the 
beams), as measured in the lab frame. 

(ii) Calculate the interaction force between the beams (per particle) and the resulting 
acceleration, both in the lab frame and in the system moving with the electrons. Compare the results and 
give a brief discussion of the comparison. 

9.18. Spell out the Lorentz transform of the scalar potential and the vector potential components,
and use the result to calculate the potentials of a point charge q, moving with a constant velocity u, as
measured in the lab reference frame.

9.19. Calculate the scalar and vector potentials created by a time-independent electric dipole p,
as measured in a reference frame which moves relatively to the dipole with a constant velocity v, with
the shortest distance (“impact parameter”) equal to b.

9.20. Calculate the scalar and vector potentials created by a time-independent magnetic dipole
m, as measured in a reference frame which moves relatively to the dipole with a constant velocity $v \ll
c$, with the shortest distance (“impact parameter”) equal to b.

9.21. Assuming that the magnetic monopole does exist and has magnetic charge g, calculate the
change $\Delta\Phi$ of magnetic flux in a superconductor ring due to the passage of a single monopole through it.
Evaluate $\Delta\Phi$ for the monopole charge conjectured by Dirac, $g = ng_0 \equiv n(2\pi\hbar/e)$, where n is an integer;
compare the result with the magnetic flux quantum $\Phi_0$ (6.55) and discuss their relation.

9.22.* Calculate the trajectory of a relativistic particle in a uniform electrostatic field E for the
case of arbitrary direction of its initial velocity u(0), using two different approaches - one of them
different from the approach used in Sec. 6 for the case $\mathbf{u}(0) \perp \mathbf{E}$.

9.23. A charged relativistic particle with velocity u performs planar cyclotron rotation in a
uniform external magnetic field B. How much would the velocity and orbit radius change at a slow
change of the field to a new magnitude B’?

9.24.* Analyze the motion of a relativistic particle in uniform, mutually perpendicular fields E
and B, for the particular case when E is exactly equal to cB.

9.25.* Find the law of motion of a relativistic particle in uniform, parallel, static fields E and B.

9.26. Neglecting relativistic effects, calculate the smallest voltage V that has to be applied
between the anode and cathode of a magnetron (see Fig. 13 and its discussion) to enable electrons to
reach the anode in the absence of electron-electron interactions and collisions with the residual gas
molecules. You may model the cathode and anode as two coaxial round cylinders, of radii $R_1$ and $R_2$,
respectively, assume that the magnetic field B, directed along their common axis, is uniform, and
neglect the initial velocity of the electrons emitted by the cathode. (After the solution, estimate the
validity of the last assumption for reasonable values of parameters.)

9.27. A charged, relativistic particle has been injected into a uniform electric field that oscillates
in time with frequency $\omega$. Calculate the time dependence of the particle’s velocity, as observed from a
lab frame.

9.28. Analyze motion of a nonrelativistic particle in a region where the electric and magnetic
fields are both constant and uniform, but not necessarily parallel or perpendicular to each other.

9.29. A static distribution of electric charge in otherwise free space has created a time-
independent distribution E(r) of the electric field. Use two different approaches to express the energy
density u’ and the Poynting vector S’, as observed in a reference frame moving with constant velocity v,
via the components of vector E. In particular, is S’ equal to (-vu’)?

9.30. A plane wave, of frequency $\omega$ and intensity S, is normally incident on a perfect mirror,
moving with velocity v in the same direction as the wave.

(i) Calculate the reflected wave’s frequency, as observed in the lab reference frame, and

(ii) use the Lorentz transform of the fields to calculate the reflected wave’s intensity
- both as observed from the lab reference frame.

9.31. Carry out the second task of the previous problem by using the relations between wave’s
energy, power, and momentum.

Hint: As a byproduct, this approach should also give you the pressure exerted by the wave on the
moving mirror.

9.32. Consider the simple model of plane capacitor charging by a lumped
current source, shown in Fig. on the right, and prove that the momentum given by
the constant, uniform external magnetic field B to the current-carrying conductor
is equal and opposite to the momentum of the electromagnetic field that current
I(t) builds up in the capacitor. (You may let the capacitor be planar and very
broad, and neglect the fringe field effects.)

9.33. Consider an electromagnetic plane wave packet propagating in free space, with the electric
field represented as the Fourier integral

$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\int_{-\infty}^{+\infty}\mathbf{E}_k e^{i\psi_k}\,dk, \quad \text{with } \psi_k \equiv kz - \omega_kt, \text{ and } \omega_k \equiv c|k|.$$

Express the full linear momentum (per unit area of wave’s front) of the packet via the complex
amplitudes $\mathbf{E}_k$. Does the momentum depend on time? (In contrast with Problem 7.7, in this case the
wave packet is not necessarily narrow.)

9.34. Calculate the pressure exerted on well-conducting walls of a waveguide with rectangular
(a×b) cross-section by a wave propagating along it in the fundamental ($H_{10}$) mode. Give an
interpretation of the result.
Chapter 10. Radiation by Relativistic Charges

In this chapter, we return to the electromagnetic wave radiation by moving charges, because the review
of the special relativity background in the previous chapter enables an analysis of the radiation effects
for arbitrary speed of the charged particle. After an analysis of such important particular cases as
synchrotron radiation and “Bremsstrahlung” (brake radiation), we will discuss the apparently
unrelated effect of Coulomb losses, which nevertheless will lead us to such important phenomena as the
Cherenkov radiation and transitional radiation. At the end of the chapter, I will briefly review the
effects of back action of the emitted radiation on the emitting particle, whose analysis reveals some
limitations of classical electrodynamics.


10.1. Liénard-Wiechert potentials

A convenient starting point for the discussion of radiation by relativistic moving charges is
provided by Eqs. (8.17) for retarded potentials. In free space these formulas are reduced to

$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}', t - R/c)}{R}\,d^3r', \qquad \mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\int\frac{\mathbf{j}(\mathbf{r}', t - R/c)}{R}\,d^3r'. \tag{10.1}$$

Here R is the magnitude of the vector

$$\mathbf{R} \equiv \mathbf{r} - \mathbf{r}', \tag{10.2}$$

that connects the source point r’ to the observation point r. As a reminder, Eqs. (1) were derived from
the Maxwell equations without any restrictions, and are very convenient for situations with continuous
distribution of charge and current. On the other hand, for point charges, with delta-functional $\rho$ and j, it
is more convenient to recast these relations into a simpler form that would not require the integration
over the r’ space.


This reduction, however, requires care. Indeed, for a single point charge q moving with velocity
u, such integration of Eqs. (1), if carried out naively, would yield the following apparent result:

$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\varepsilon_0}\frac{q}{R_r}, \quad \text{i.e.}\quad \frac{\phi}{c} = \frac{\mu_0}{4\pi}\frac{qc}{R_r}; \qquad \mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\frac{q\mathbf{u}_r}{R_r}, \qquad \text{[WRONG!]} \tag{10.3}$$

where index r marks the variables to be calculated at time $t - R_r/c$. This is a good example of how the
science of relativity (even the special one :-) cannot be taken too lightly. Indeed, 4-vectors (9.84)-(9.85),
formed from potentials (3), would not obey the Lorentz transform rule (9.91), because distance $R_r$ also
depends on the reference frame it is measured in.

In order to correct the error, we need, first of all, to specify what exactly is $R_r$ for a point charge.
Evidently, in this case, only one space-time point {r’, t’} may contribute to integrals (1) for any
observation point {r, t}. The point should be found from the retardation condition $t' = t - R_r/c$, i.e.

$$c(t - t') = |\mathbf{r}(t) - \mathbf{r}'(t')|. \tag{10.4}$$

Figure 1 depicts the graphical solution of this self-consistency equation as the point of intersection of
the light cone of the observation point (see Fig. 9.9 and its discussion) and the trajectory of the charged
particle. 1 As in Eq. (3), I will use index r to mark all variables corresponding to the retarded point {r’,
t’} that satisfies Eq. (4); for example, $t' = t_r$, $c(t - t_r) = R_r$ (see Fig. 1), $\mathbf{u}(t_r) = \mathbf{u}_r$, etc., as measured in
the “lab” reference frame - generally, any inertial frame that moves with the same velocity as the
observation point at the moment t we are considering.



Now let us write Eqs. (1) for a point charge in another inertial reference frame 0', whose velocity
(as measured in the lab frame) coincides, at moment $t_r$, with the velocity ($\mathbf{u}_r$) of the point charge.
In that frame the charge rests, so that

$$\phi' = \frac{q}{4\pi\varepsilon_0 R'}, \qquad \mathbf{A}' = 0, \qquad (10.5)$$


but let us remember that this $R'$ may not be equal to $R_r$, because the latter distance is measured in the
"lab" reference frame. Let us use the identity $1/\varepsilon_0 = \mu_0 c^2$ to rewrite Eq. (5) in the form of components of
a 4-vector similar in structure to Eq. (3):

$$\frac{\phi'}{c} = \frac{\mu_0}{4\pi}\frac{qc}{R'}, \qquad \mathbf{A}' = 0. \qquad (10.6)$$


Now it is easy to guess the correct answer for the whole 4-potential:

$$A^\alpha = \frac{\mu_0}{4\pi}\frac{qc\,u^\alpha}{u_\beta R^\beta}, \qquad (10.7)$$


where (just as a reminder), $A^\alpha = \{\phi/c, \mathbf{A}\}$, $u^\alpha = \gamma\{c, \mathbf{u}\}$, and $R^\alpha$ is a 4-vector of the event distance, formed
similarly to that of a single event - cf. Eq. (9.48):

$$R^\alpha = \{c(t - t'), \mathbf{R}\} = \{c(t - t'), \mathbf{r} - \mathbf{r}'\}. \qquad (10.8)$$


Indeed, we need the 4-vector A a that would: 

(i) obey the Lorentz transform, 

(ii) have its spatial components $A_j$ scaling as $u_j$, and

(iii) be reduced to the correct result (5) in the reference frame moving with the charge. 


1 As Fig. 1 shows, there is always another point $\{\mathbf{r}'', t''\}$, with $t'' > t$, that is formally also a solution of Eq. (4),
but it does not fit Eqs. (1), because the field induced at that point would violate the causality principle.

Lienard-Wiechert potentials


Formula (7) evidently satisfies all these requirements, because the scalar product in its denominator is
just

$$u_\beta R^\beta = \gamma\{c, -\mathbf{u}\}\cdot\{c(t - t'), \mathbf{R}\} = \gamma[c^2(t - t') - \mathbf{u}\cdot\mathbf{R}] = \gamma c(R - \boldsymbol{\beta}\cdot\mathbf{R}) = \gamma cR(1 - \boldsymbol{\beta}\cdot\mathbf{n}), \qquad (10.9)$$

where $\mathbf{n} = \mathbf{R}/R$ is a unit vector in the observer's direction, $\boldsymbol{\beta} = \mathbf{u}/c$ is the normalized velocity of the
particle, and $\gamma = 1/(1 - u^2/c^2)^{1/2}$. 2 In the reference frame of the charge (in which $\boldsymbol{\beta} = 0$ and $\gamma = 1$),
expression (9) is reduced to $cR'$, so that Eq. (7) is correctly reduced to Eq. (6). Now let us spell out
components of Eq. (7) in the lab frame (in which $t' = t_r$ and $R = R_r$):


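$$\phi(\mathbf{r},t) = \frac{1}{4\pi\varepsilon_0}\left[\frac{q}{R(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_r, \qquad (10.10a)$$

$$\mathbf{A}(\mathbf{r},t) = \frac{\mu_0}{4\pi}\left[\frac{q\mathbf{u}}{R(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_r. \qquad (10.10b)$$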


These formulas are called the Lienard-Wiechert potentials. 3 In the nonrelativistic limit, they
coincide with the naive guess (3), but in the general case include the additional factor $(1 - \boldsymbol{\beta}\cdot\mathbf{n})$ in the
denominator, which describes the apparent increase of the effective charge density of the source due to
the apparent change of distance $R_r$ at $\beta \sim 1$. In order to understand its origin, let us consider a simple 1D
model of the radiation: a uniformly charged rod, of length $l$, moving directly toward an observer located
at point $\mathbf{r}$, with a constant speed $u$ (Fig. 2). As a result of this motion, the observer may measure the
field (1) induced by the rod, within a certain time interval $[t_{start}, t_{stop}]$.

Fig. 10.2. Geometric effect behind factor $(1 - \boldsymbol{\beta}\cdot\mathbf{n})$ in the Lienard-Wiechert potentials.


The trailing end of this field pulse, observed at $t = t_{stop}$, is emitted by the far (in Fig. 2, leftmost)
end of the rod at moment $t'_{stop}$. Due to the limited speed of the rod, $u < c$, the moment $t'_{stop}$ comes earlier
than the moment $t'_{start}$, at which the front end of the rod emits the field that starts the observed pulse.
During the positive time interval $(t'_{start} - t'_{stop})$, the rod passes an additional distance $u(t'_{start} - t'_{stop})$ - see
the bottom panel of Fig. 2. Using the evident relations shown on each of the two panels of Fig. 2 to
express $r$, and requiring them to give the same result, we get the following relation:

$$c(t_{stop} - t'_{stop}) = u(t'_{start} - t'_{stop}) + l + c(t_{start} - t'_{start}). \qquad (10.11)$$


2 Note the following identities: $\gamma^2 = 1/(1 - \beta^2)$ and $(\gamma^2 - 1) = \beta^2/(1 - \beta^2) = \gamma^2\beta^2$, which may be very handy for the
relativity-related algebra.

3 They were derived in 1898 by A.-M. Lienard and (apparently, independently) in 1900 by E. Wiechert.

Using it to express the difference $\Delta t'(u) = t'_{start} - t'_{stop} > 0$ in the limit when $t_{stop} \to t_{start}$, i.e. when the
observed radiation pulse is short, we get

$$\Delta t'(u) = \frac{l}{c - u} = \frac{l/c}{1 - \beta} = \frac{\Delta t'(0)}{1 - \beta}, \quad \text{where}\ \ \Delta t'(0) = \frac{l}{c}, \qquad (10.12)$$


is a factor of $1/(1 - \beta)$ larger than what it would be at negligible source speed. Hence the radiation
emitted by the rod during a longer interval of the retarded time $t_r$, and from a longer apparent length
$c\Delta t'(u) = l/(1 - \beta)$, is compressed into the same short observation interval as $u$ is increased. Since the
charge density of the rod itself does not depend on this geometric effect, the effective charge contributing
to the potentials is increased, and the field in the observation point is increased accordingly. Somewhat
counter-intuitively, Eq. (12) shows that this field renormalization is independent of the source size $l$,
and hence takes place even in the limit $l \to 0$, e.g., for a point source. 4

So, the 4-vector formalism has provided a big help for the calculation of field potentials. Now,
the electric and magnetic fields corresponding to the potentials may be found by plugging Eqs. (10)
into the general formulas (6.106). This operation should also be performed very carefully, because Eqs.
(6.106) require the differentiation over the coordinates $\{\mathbf{r}, t\}$ of the observation point, while we want the
fields to be expressed via particle's velocity $\mathbf{u}_r = (d\mathbf{r}'/dt')_r$ that participates in Eqs. (10). In order to find
the relation between derivatives over $t$ and $t'$, let us differentiate Eq. (4), rewritten as

$$R_r = c(t - t_r), \qquad (10.13)$$

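over $t$ and $t_r$. In order to calculate derivative $\partial R_r/\partial t_r$, let us first differentiate identity $R_r^2 = \mathbf{R}_r\cdot\mathbf{R}_r$:

$$2R_r\frac{\partial R_r}{\partial t_r} = 2\mathbf{R}_r\cdot\frac{\partial\mathbf{R}_r}{\partial t_r}. \qquad (10.14)$$

Since $\partial\mathbf{R}_r/\partial t_r = \partial(\mathbf{r} - \mathbf{r}')/\partial t_r = -\partial\mathbf{r}'/\partial t_r = -\mathbf{u}_r$, Eq. (14) yields

$$\frac{\partial R_r}{\partial t_r} = \frac{\mathbf{R}_r}{R_r}\cdot\frac{\partial\mathbf{R}_r}{\partial t_r} = -(\mathbf{n}\cdot\mathbf{u})_r. \qquad (10.15)$$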


Now let us differentiate the same function $R_r$ over $t$, keeping $\mathbf{r}$ fixed. On one hand, Eq. (13) yields

$$\frac{\partial R_r}{\partial t} = c - c\frac{\partial t_r}{\partial t}. \qquad (10.16)$$


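On the other hand, according to Eq. (4), if $\mathbf{r}$ is fixed, $t_r$ is a function of $t$ alone, so that, using Eq. (15),
we may write

$$\frac{\partial R_r}{\partial t} = \frac{\partial R_r}{\partial t_r}\frac{\partial t_r}{\partial t} = -(\mathbf{n}\cdot\mathbf{u})_r\frac{\partial t_r}{\partial t}. \qquad (10.17)$$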


Requiring Eqs. (16) and (17) to give the same result, we get the same factor that participates in the 
Lienard-Wiechert potentials (10) and Eq. (12): 


$$\frac{\partial t_r}{\partial t} = \frac{c}{c - (\mathbf{n}\cdot\mathbf{u})_r} = \left[\frac{1}{1 - \boldsymbol{\beta}\cdot\mathbf{n}}\right]_r. \qquad (10.18)$$

4 Note that this time compression effect (linear in $\beta$) has nothing to do with the Lorentz time dilation (9.21),
which is quadratic in $\beta$. (Indeed, all our arguments above are referred to the same, lab frame.) Rather, it is close in
nature to the Doppler effect.


This relation may be readily interpreted - at least semi-quantitatively. At fixed $\mathbf{r}$, variation $dt$ of
the observation time corresponds to a small vertical shift of the light cone in Fig. 1, while $dt_r$ is the
corresponding shift of the retarded time $t_r$, i.e. of the point where the world line $\mathbf{r}'(t')$ crosses the light
cone of the observation point $\mathbf{r}(t)$. It is evident from that figure that if the particle does not move (i.e. its
world line is a vertical straight line), then $dt_r = dt$. On the other hand, if the particle moves fast
(with speed $u \to c$) toward the observation point, its world line crosses the light cone at a small
("grazing") angle, so that $dt_r \gg dt$, in accordance with Eq. (18).
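Relation (18) may also be checked numerically. The following minimal sketch is an illustration added under explicit assumptions, not a part of the derivation: it uses units with c = 1, a hypothetical test trajectory (uniform motion along the x axis with u = 0.8c), and function names invented for this example. It solves the retardation condition (4) by bisection, and compares a finite-difference estimate of the derivative of the retarded time over the observation time with the factor 1/(1 - β·n) taken at the retarded point.

```python
import math

C = 1.0  # assumption: units in which the speed of light equals 1


def trajectory(t_prime):
    """Hypothetical test trajectory: uniform motion along x with u = 0.8c.

    Returns (position, velocity) of the charge at time t_prime.
    """
    u = 0.8 * C
    return (u * t_prime, 0.0, 0.0), (u, 0.0, 0.0)


def retarded_time(r, t):
    """Solve c(t - t_r) = |r - r'(t_r)| for t_r <= t by bisection."""

    def g(tr):
        pos, _ = trajectory(tr)
        return C * (t - tr) - math.dist(r, pos)

    lo, hi = t - 1.0, t  # g(hi) <= 0, and g decreases with tr for any u < c
    while g(lo) < 0.0:   # expand the bracket into the past until g(lo) >= 0
        lo = t - 2.0 * (t - lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compare_derivative(r, t, h=1e-5):
    """Return (finite-difference dt_r/dt, 1/(1 - beta.n) at the retarded point)."""
    tr = retarded_time(r, t)
    pos, vel = trajectory(tr)
    R = math.dist(r, pos)
    n = [(ri - pi) / R for ri, pi in zip(r, pos)]
    beta_n = sum(vi * ni for vi, ni in zip(vel, n)) / C
    numeric = (retarded_time(r, t + h) - retarded_time(r, t - h)) / (2.0 * h)
    return numeric, 1.0 / (1.0 - beta_n)


numeric, analytic = compare_derivative((5.0, 2.0, 0.0), 0.0)
print(numeric, analytic)
```

For an observation point ahead of the charge, both numbers should exceed 1 (the retarded time runs faster than the observation time), while for a point behind the charge they should be smaller than 1.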

Since the retarded time $t_r$, as the solution of Eq. (13), depends not only on the observation time $t$,
but also on the observation point $\mathbf{r}$, we also need to calculate its spatial derivative - the gradient in
$\mathbf{r}$-space. A calculation, absolutely similar to that carried out above, yields

$$\nabla t_r = -\left[\frac{\mathbf{n}}{c(1 - \boldsymbol{\beta}\cdot\mathbf{n})}\right]_r. \qquad (10.19)$$


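Electric and magnetic fields of relativistic particle

Using Eqs. (6.106), (18) and (19), the calculation of fields from Eqs. (10) is straightforward but
tedious, and is left for reader's exercise. For the electric field, the result is:

$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\left[\frac{\mathbf{n} - \boldsymbol{\beta}}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3R^2} + \frac{\mathbf{n}\times\{(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3cR}\right]_r. \qquad (10.20a)$$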


The only good news about this uncomfortably bulky result is that a similar differentiation gives
essentially the same formula for the magnetic field, which may be expressed via Eq. (20a): 5

$$\mathbf{B} = \mathbf{n}_r\times\frac{\mathbf{E}}{c}, \quad \text{i.e.}\ \ \mathbf{H} = \frac{1}{Z_0}\mathbf{n}_r\times\mathbf{E}. \qquad (10.20b)$$


Thus the magnetic and electric fields are always perpendicular to each other, and related just as in a
plane wave - cf. Eq. (7.6), 6 with the only difference that now vector $\mathbf{n}_r$ may be a function of time.


As a sanity check, let us use Eq. (20a) as an alternative way to find the electric field of a charge
moving without acceleration, i.e. uniformly, along a straight line - see Fig. 9.11 (reproduced in Fig. 3)
and its discussion in Sec. 9.5. (This example will also exhibit the challenges of practical application of the
Lienard-Wiechert formulas.) In this case vector $\boldsymbol{\beta}$ does not change in time, so that the second term in Eq.
(20a) vanishes, and all we need to do is to spell out the Cartesian components of the first term. Let us
select the coordinate axes and time origin in the same way as shown in Fig. 3, and make a clear
distinction between the actual position, $\mathbf{r}'(t) = \{ut, 0, 0\}$, of the charged particle at the instant $t$ we are
considering, and its retarded position $\mathbf{r}'(t_r)$, where $t_r$ is the solution of Eq. (13), i.e. the moment when the
particle's field, moving with the speed of light, reaches the observation point $\mathbf{r}$. In these coordinates

$$\boldsymbol{\beta} = \{\beta, 0, 0\}, \quad \mathbf{r} = \{0, b, 0\}, \quad \mathbf{r}'(t_r) = \{ut_r, 0, 0\}, \quad \mathbf{n}_r = \{\cos\theta, \sin\theta, 0\}, \qquad (10.21)$$

5 An alternative way to derive Eqs. (20) is to plug the 4-vector of potentials, given by Eq. (7), into Eq. (9.124) to
calculate the field strength tensor. This calculation yields

$$F^{\alpha\beta} = \frac{\mu_0 qc}{4\pi}\frac{1}{u_\gamma R^\gamma}\frac{d}{d\tau}\left[\frac{R^\alpha u^\beta - R^\beta u^\alpha}{u_\delta R^\delta}\right].$$

Now the elements of this tensor may be identified with field components in accordance with Eq. (9.125).

6 Superficially, Eq. (20b) contradicts the electrostatics where $\mathbf{B}$ should vanish while $\mathbf{E}$ stays finite. However, note
that according to the Coulomb law for a point charge, in this case $\mathbf{E} = E\mathbf{n} = E\mathbf{n}_r$, so that $\mathbf{B} \propto \mathbf{n}_r\times\mathbf{E} \propto \mathbf{n}_r\times\mathbf{n}_r = 0$.


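with $\cos\theta = -ut_r/R_r$, so that $[(\mathbf{n} - \boldsymbol{\beta})_x]_r = -ut_r/R_r - \beta$, and for the longitudinal component of the electric
field, Eq. (20a) yields

$$E_x = \frac{q}{4\pi\varepsilon_0}\left[\frac{-ut_r/R - \beta}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3R^2}\right]_r = \frac{q}{4\pi\varepsilon_0}\left[\frac{-ut_r - \beta R}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3R^3}\right]_r. \qquad (10.22)$$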



Fig. 10.3. Geometry of the linearly 
moving charge problem. 


But according to Eq. (13), product $\beta R_r$ may be presented as $\beta c(t - t_r) = u(t - t_r)$. Plugging this
expression into Eq. (22), we may eliminate the explicit dependence of $E_x$ on time $t_r$:

$$E_x = \frac{q}{4\pi\varepsilon_0}\frac{-ut}{\gamma^2[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R]_r^3}. \qquad (10.23)$$


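The nonvanishing transverse component of the field also has a similar form:

$$E_y = \frac{q}{4\pi\varepsilon_0}\left[\frac{\sin\theta}{\gamma^2(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3R^2}\right]_r = \frac{q}{4\pi\varepsilon_0}\frac{b}{\gamma^2[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R]_r^3}, \qquad (10.24)$$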


while $E_z = 0$. Hence, the only combination of $t_r$ and $R_r$ we still need to calculate is $[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R]_r$. From
Fig. 3, $\boldsymbol{\beta}\cdot\mathbf{n}_r = \beta\cos\theta = -\beta ut_r/R_r$, so that $(1 - \boldsymbol{\beta}\cdot\mathbf{n})R_r = R_r + \beta ut_r = c(t - t_r) + c\beta^2t_r = ct - ct_r/\gamma^2$. What
remains is to find time $t_r$ from the self-consistency equation (13) that in our case (Fig. 3) takes the form

$$R_r^2 = c^2(t - t_r)^2 = b^2 + (ut_r)^2. \qquad (10.25)$$


After solving this quadratic equation (with the appropriate negative sign before the square root, in order
to get $t_r < t$),

$$t_r = \gamma^2t - \frac{\gamma}{c}\left(u^2\gamma^2t^2 + b^2\right)^{1/2}, \qquad (10.26)$$


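we obtain a simple result:

$$[(1 - \boldsymbol{\beta}\cdot\mathbf{n})R]_r = \frac{1}{\gamma}\left(u^2\gamma^2t^2 + b^2\right)^{1/2}, \qquad (10.27)$$

so that the electric field components are

$$E_x = -\frac{q}{4\pi\varepsilon_0}\frac{\gamma ut}{(b^2 + \gamma^2u^2t^2)^{3/2}}, \qquad E_y = \frac{q}{4\pi\varepsilon_0}\frac{\gamma b}{(b^2 + \gamma^2u^2t^2)^{3/2}}, \qquad E_z = 0. \qquad (10.28)$$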


These are exactly Eqs. (9.139), 7 which had been obtained in Sec. 9.5 by simpler means, without
the necessity to solve the self-consistency equation for $t_r$. However, that alternative approach was
essentially based on the inertial motion of the particle, and cannot be used in problems in which the particle
moves with acceleration. In those problems, the second term in Eq. (20a), describing wave radiation, is
essential and most important.
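The agreement between the first term of Eq. (20a) and the closed-form field of a uniformly moving charge can also be verified numerically. The sketch below is an illustration under stated assumptions: units with c = 1 and q/4πε0 = 1, the observation point placed at distance b from the trajectory along the y axis, and hypothetical function names. It evaluates the velocity term of Eq. (20a) at the retarded time, taken as the earlier root of Eq. (25), and compares the result with the closed-form components.

```python
import math


def field_from_retarded_point(beta, b, t):
    """(E_x, E_y) from the velocity term of Eq. (20a); c = 1, q/(4 pi eps0) = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Earlier root of (t - t_r)^2 = b^2 + (beta t_r)^2, so that t_r < t.
    t_r = gamma**2 * t - gamma * math.sqrt((gamma * beta * t) ** 2 + b**2)
    R = t - t_r                        # retarded distance, R_r = c (t - t_r)
    n_x, n_y = -beta * t_r / R, b / R  # unit vector from the retarded position
    k = 1.0 - beta * n_x               # the factor (1 - beta . n)
    pref = 1.0 / (gamma**2 * k**3 * R**2)
    return pref * (n_x - beta), pref * n_y


def field_closed_form(beta, b, t):
    """(E_x, E_y) of a uniformly moving charge in the same units."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    d = (b**2 + (gamma * beta * t) ** 2) ** 1.5
    return -gamma * beta * t / d, gamma * b / d


for beta, b, t in [(0.6, 1.0, 0.0), (0.6, 1.0, 2.0), (0.9, 0.5, -3.0)]:
    print(field_from_retarded_point(beta, b, t), field_closed_form(beta, b, t))
```

The two functions should return the same pairs of numbers up to rounding errors, for any β < 1.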


10.2. Radiation power 

Let us calculate the angular distribution of particle's radiation. For that, we need to return to
Eqs. (20) and use them to find the Poynting vector $\mathbf{S} = \mathbf{E}\times\mathbf{H}$, and in particular its component $S_n = \mathbf{S}\cdot\mathbf{n}_r$, at large
distances $R$ from the particle. Following tradition, let us express the result as the radiated energy per
unit solid angle per unit time interval $dt_r$ of the radiation (rather than its measurement), using Eq. (18):

$$\frac{d\mathscr{P}}{d\Omega} = -\frac{d\mathscr{E}}{d\Omega\,dt_r} = \left[R^2(\mathbf{E}\times\mathbf{H})\cdot\mathbf{n}\,(1 - \boldsymbol{\beta}\cdot\mathbf{n})\right]_r. \qquad (10.29)$$


At sufficiently large distances from the particle, i.e. in the limit $R \to \infty$, the contribution of the first
(essentially, the Coulomb field) term in the square brackets of Eq. (20a) vanishes as $1/R^2$, so that we get
a key formula valid for an arbitrary law of particle motion: 8

Angular density of radiation power

$$\frac{d\mathscr{P}}{d\Omega} = \frac{Z_0q^2}{(4\pi)^2}\frac{|\mathbf{n}\times[(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}]|^2}{(1 - \mathbf{n}\cdot\boldsymbol{\beta})^5}. \qquad (10.30)$$

Now, let us apply this important result to some simple cases. First of all, Eq. (30) says that a
charge moving with constant velocity $\boldsymbol{\beta}$ does not radiate at all. This might be expected from our analysis
of this case in Sec. 9.5, because in the reference frame moving with the charge it produces only the
Coulomb electrostatic field, i.e. no radiation.

Next, let us consider a linear motion of a point charge with a nonvanishing acceleration -
evidently directed along the motion line. With the coordinate axes directed as shown in Fig. 4a, each of
the vectors involved in Eq. (30) has at most two nonvanishing Cartesian components:

$$\mathbf{n} = \{\sin\theta, 0, \cos\theta\}, \quad \boldsymbol{\beta} = \{0, 0, \beta\}, \quad \dot{\boldsymbol{\beta}} = \{0, 0, \dot\beta\}, \qquad (10.31)$$


where $\theta$ is the angle between the directions of particle's motion and radiation propagation. Plugging
these expressions into Eq. (30) and performing the vector multiplications, we get

$$\frac{d\mathscr{P}}{d\Omega} = \frac{Z_0q^2}{(4\pi)^2}\dot\beta^2\frac{\sin^2\theta}{(1 - \beta\cos\theta)^5}. \qquad (10.32)$$


7 A similar calculation of magnetic field components from Eq. (20b) gives the results identical to Eqs. (9.140). 

8 If the direction of radiation, $\mathbf{n}$, does not change in time, this formula does not contain the observation point $\mathbf{r}$.
Hence, from this point on, index $r$ may be safely dropped for brevity, though we should always remember that $\boldsymbol{\beta}$
in Eq. (30) is the reduced velocity of the particle at the instant of radiation's emission, not detection.

Fig. 10.4. Radiation at linear acceleration: (a) geometry of the problem, and (b) the last
fraction in Eq. (32) as a function of angle $\theta$.


Figure 4b shows the angular distribution of this radiation, for three values of particle's speed. If
it is relatively low ($\beta \ll 1$), the denominator in Eq. (32) is close to 1 for all observation angles $\theta$, so that
the angular distribution of the radiation power is close to $\sin^2\theta$ - just as it follows from the general
nonrelativistic formula (8.26). However, as the velocity is increased, the denominator becomes less than 1 for
$\theta < \pi/2$, i.e. for the forward-looking directions, and larger than 1 for back directions. As a result, the
radiation toward particle's velocity is increased (somewhat counter-intuitively, regardless of the
acceleration sign!), while that in the back direction is suppressed. For ultrarelativistic particles ($\beta \to 1$),
this trend is enormously exacerbated, and radiation to very small forward angles dominates. In order to
describe this main part of the distribution, we may expand the trigonometric functions of $\theta$, participating
in Eq. (32), into the Taylor series in small $\theta$, and keep only their leading terms: $\sin\theta \approx \theta$, $\cos\theta \approx 1 - \theta^2/2$,
so that $(1 - \beta\cos\theta) \approx (1 + \gamma^2\theta^2)/2\gamma^2$. The resulting expression,


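$$\frac{d\mathscr{P}}{d\Omega} \approx \frac{2Z_0q^2}{\pi^2}\dot\beta^2\gamma^8\frac{(\gamma\theta)^2}{(1 + \gamma^2\theta^2)^5}, \quad \text{for}\ \gamma \gg 1, \qquad (10.33)$$

describes a narrow distribution of radiation, with a maximum at angle

$$\theta_0 = \frac{1}{2\gamma} \ll 1. \qquad (10.34)$$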


Note that due to the axial symmetry of the result, and the fact that according to Eq. (33), $d\mathscr{P}/d\Omega = 0$ in
the exact direction of particle's propagation ($\theta = 0$), Eq. (33) describes a narrow circular "hollow cone"
of radiation. Another important aspect of this result is how fast (as $\gamma^8$, at fixed $\dot\beta$) the maximum radiation
brightness grows with the Lorentz factor $\gamma$, i.e. with particle's energy $\mathscr{E} = \gamma mc^2$.
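The position of the "hollow cone" maximum may be checked against the angular factor of Eq. (32) without the small-angle expansion. The following sketch is only an illustration; the function names are hypothetical, and the closed-form root used in it follows from requiring the derivative of sin²θ/(1 - β cos θ)⁵ over θ to vanish, which gives 3β cos²θ + 2 cos θ - 5β = 0. A simple ternary search is used as an independent numerical check.

```python
import math


def pattern(theta, beta):
    """Angular factor sin^2(theta) / (1 - beta cos(theta))^5 of Eq. (32)."""
    return math.sin(theta) ** 2 / (1.0 - beta * math.cos(theta)) ** 5


def theta_max_exact(beta):
    """Angle of the maximum, from the root of 3 b cos^2 + 2 cos - 5 b = 0."""
    cos_t = (math.sqrt(1.0 + 15.0 * beta**2) - 1.0) / (3.0 * beta)
    return math.acos(cos_t)


def theta_max_search(beta, lo=0.0, hi=math.pi, steps=200):
    """Ternary search for the maximum of the single-peaked pattern on [lo, hi]."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if pattern(m1, beta) < pattern(m2, beta):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


for gamma in (2.0, 10.0, 100.0):
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    print(gamma, theta_max_exact(beta), theta_max_search(beta), 0.5 / gamma)
```

For moderate γ the exact maximum deviates noticeably from 1/2γ, while for γ >> 1 the two values should approach each other.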

Still, the total radiated power $\mathscr{P}$ (into all observation angles) at linear acceleration is not too high
for any practicable values of parameters. In order to show this, it is convenient to calculate $\mathscr{P}$ for an
arbitrary motion of the particle first. It is possible to do this by a straightforward integration of Eq. (30)
over the full solid angle, but let me demonstrate how $\mathscr{P}$ may be found (or rather guessed) from the
general relativistic arguments. In Sec. 8.2, we have derived Eq. (8.27) for the electric dipole radiation
for nonrelativistic particle motion. That result is valid, in particular, for one charged particle whose
electric dipole moment's derivative over time may be expressed as $d(q\mathbf{r})/dt = (q/m)\mathbf{p}$, where $\mathbf{p}$ is
particle's mechanical momentum (not its electric dipole moment). As the result, the Larmor formula
(8.27) in free space, i.e. with $v = c$, reduces to

$$\mathscr{P} = \frac{Z_0}{6\pi c^2}\left(\frac{q}{m}\frac{d\mathbf{p}}{dt}\right)^2 = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{d\mathbf{p}}{dt}\cdot\frac{d\mathbf{p}}{dt}\right). \qquad (10.35)$$


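This is evidently not a Lorentz-invariant result, but it gives a clear hint how such an invariant, that is
reduced to Eq. (35) in the nonrelativistic limit, may be formed:

$$\mathscr{P} = -\frac{Z_0q^2}{6\pi m^2c^2}\frac{dp_\alpha}{d\tau}\frac{dp^\alpha}{d\tau} = \frac{Z_0q^2}{6\pi m^2c^2}\left[\left(\frac{d\mathbf{p}}{d\tau}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathscr{E}}{d\tau}\right)^2\right]. \qquad (10.36)$$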


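Plugging in the relativistic expressions, $\mathbf{p} = \gamma mc\boldsymbol{\beta}$, $\mathscr{E} = \gamma mc^2$, and $d\tau = dt/\gamma$, the last formula may
be recast into the Lienard extension of the Larmor formula: 9

$$\mathscr{P} = \frac{Z_0q^2}{6\pi}\gamma^6\left[(\dot{\boldsymbol{\beta}})^2 - (\boldsymbol{\beta}\times\dot{\boldsymbol{\beta}})^2\right] = \frac{Z_0q^2}{6\pi}\gamma^4\left[(\dot{\boldsymbol{\beta}})^2 + \gamma^2(\boldsymbol{\beta}\cdot\dot{\boldsymbol{\beta}})^2\right], \qquad (10.37)$$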


which may be also obtained by a direct integration of Eq. (30) over the full solid angle, thus confirming
our guess. However, for some applications, it is beneficial to express $\mathscr{P}$ via the time evolution of
particle's momentum alone. For that, we may differentiate the fundamental relativistic relation (9.78),
$\mathscr{E}^2 = (mc^2)^2 + (pc)^2$, over the proper time $\tau$ to get

$$2\mathscr{E}\frac{d\mathscr{E}}{d\tau} = 2c^2p\frac{dp}{d\tau}, \quad \text{i.e.}\ \ \frac{d\mathscr{E}}{d\tau} = \frac{c^2p}{\mathscr{E}}\frac{dp}{d\tau} = u\frac{dp}{d\tau}, \qquad (10.38)$$


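where, at the last transition, the magnitude of the relativistic vector relation mentioned in Chapter 9,
$c^2\mathbf{p}/\mathscr{E} = \mathbf{u}$, has been used. Plugging this relation into Eq. (36), we may rewrite it as

Total radiation power via p

$$\mathscr{P} = \frac{Z_0q^2}{6\pi m^2c^2}\left[\left(\frac{d\mathbf{p}}{d\tau}\right)^2 - \beta^2\left(\frac{dp}{d\tau}\right)^2\right]. \qquad (10.39)$$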


Note the difference between the squared derivatives in this expression: in the first of them we have to
differentiate the momentum vector $\mathbf{p}$, and only then form a scalar by squaring the resulting vector
derivative, while in the second case, only the magnitude of the vector is differentiated. For example, for
a circular motion with constant speed (to be analyzed in detail in the next section), the second term is
zero, while the first one is not.
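The statement that the integration of Eq. (30) over the full solid angle reproduces the Lienard formula can be tested numerically. The sketch below is an illustration under stated assumptions: units in which Z0 q² = 1, a plain midpoint-rule quadrature, hypothetical function names, and an arbitrarily chosen pair of vectors β and dβ/dt. The Lienard formula is used in its form with the vector product, γ⁶[(dβ/dt)² - (β × dβ/dt)²]/6π in these units.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def angular_density(n, beta, beta_dot):
    """Eq. (30) in units where Z0 q^2 = 1."""
    diff = tuple(ni - bi for ni, bi in zip(n, beta))
    v = cross(n, cross(diff, beta_dot))
    return dot(v, v) / ((4.0 * math.pi) ** 2 * (1.0 - dot(n, beta)) ** 5)


def total_power_numeric(beta, beta_dot, n_theta=200, n_phi=200):
    """Midpoint-rule integration of Eq. (30) over the full solid angle."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        s, c = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (s * math.cos(phi), s * math.sin(phi), c)
            total += angular_density(n, beta, beta_dot) * s  # sin(theta) weight
    return total * d_theta * d_phi


def total_power_lienard(beta, beta_dot):
    """Lienard formula, gamma^6 [beta_dot^2 - (beta x beta_dot)^2] / (6 pi)."""
    gamma2 = 1.0 / (1.0 - dot(beta, beta))
    bxb = cross(beta, beta_dot)
    return gamma2**3 * (dot(beta_dot, beta_dot) - dot(bxb, bxb)) / (6.0 * math.pi)


beta, beta_dot = (0.0, 0.0, 0.5), (0.3, 0.0, 0.4)  # arbitrary test vectors
print(total_power_numeric(beta, beta_dot), total_power_lienard(beta, beta_dot))
```

With the grid used here, the two numbers should agree to a small fraction of a percent; for β much closer to 1 the forward peak becomes sharper, and a finer grid in θ would be needed.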

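However, if we return to the simplest case of linear acceleration (Fig. 4), then $(d\mathbf{p}/d\tau)^2 = (dp/d\tau)^2$,
and Eq. (39) is reduced to

$$\mathscr{P} = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{dp}{d\tau}\right)^2(1 - \beta^2) = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{dp}{d\tau}\right)^2\frac{1}{\gamma^2} = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{dp}{dt'}\right)^2 \qquad (10.40)$$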


(where $t' = t_r$ is the time of emitting radiation, as measured in the lab frame), i.e. formally coincides
with nonrelativistic Eq. (35). In order to get a better feeling of the magnitude of this radiation, we may
use the fact that $dp/dt' = d\mathscr{E}/dz'$, where $z'$ is the particle's coordinate along the line of its motion. This
allows us to rewrite Eq. (40) in the following form:

$$\mathscr{P} = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{d\mathscr{E}}{dz'}\right)^2 = \frac{Z_0q^2}{6\pi m^2c^2}\frac{d\mathscr{E}}{dz'}\frac{d\mathscr{E}}{dt'}\frac{dt'}{dz'} = \frac{Z_0q^2}{6\pi m^2c^2u}\frac{d\mathscr{E}}{dz'}\frac{d\mathscr{E}}{dt'}. \qquad (10.41)$$

9 The second form of Eq. (10.37), frequently more convenient for applications, may be readily obtained from the
first one by applying MA Eq. (7.7a) to the vector product.


For the most important case of ultrarelativistic motion ($u \to c$), this result may be presented as

$$\frac{\mathscr{P}}{d\mathscr{E}/dt'} \approx \frac{2}{3}\frac{d(\mathscr{E}/mc^2)}{d(z'/r_c)}, \qquad (10.42)$$


where $r_c$ is the classical radius of the particle, given by Eq. (8.41). This formula shows that the radiated
power, i.e. the change of particle's energy due to radiation, is much smaller than that due to the
accelerating field, unless energy as large as $mc^2$ is gained on the classical radius of the particle. For
example, for an electron, such acceleration would require the accelerating electric field of the order of
(0.5 MV)/(3×10^-15 m) ~ 10^14 MV/m, while practicable accelerating fields, limited by the electric
breakdown effects, are many orders of magnitude lower. Such smallness of radiative losses of energy is
actually a large advantage of linear electron accelerators - such as the famous 2-mile-long SLAC 10 that
can accelerate electrons or positrons to energies up to 50 GeV, i.e. to $\gamma \approx 10^5$.
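The orders of magnitude quoted above are easy to reproduce. The following sketch is a rough illustration under explicit assumptions: a singly charged particle (q = e), rounded values of the physical constants typed in by hand, hypothetical function names, and an average SLAC gradient estimated simply as 50 GeV gained over 2 miles. It evaluates the classical radius, the field at which an electron would gain its rest energy over that radius, and the right-hand side of Eq. (42) for the estimated gradient.

```python
import math

E_CHARGE = 1.602176634e-19       # elementary charge, C
EPS0 = 8.8541878128e-12          # electric constant, F/m
ELECTRON_REST_EV = 0.51099895e6  # electron rest energy m c^2, eV


def classical_radius_m(rest_energy_ev=ELECTRON_REST_EV):
    """r_c = e^2 / (4 pi eps0 m c^2) for q = e, with m c^2 given in eV."""
    return E_CHARGE / (4.0 * math.pi * EPS0 * rest_energy_ev)


def radiated_fraction(gradient_ev_per_m, rest_energy_ev=ELECTRON_REST_EV):
    """Right-hand side of Eq. (42): (2/3) (dE/dz') r_c / (m c^2)."""
    r_c = classical_radius_m(rest_energy_ev)
    return (2.0 / 3.0) * gradient_ev_per_m * r_c / rest_energy_ev


r_c = classical_radius_m()
critical_field = ELECTRON_REST_EV / r_c   # V/m: gain of m c^2 over r_c
slac_gradient = 50e9 / (2 * 1609.344)     # eV/m: assumed 50 GeV over 2 miles
print(r_c, critical_field, radiated_fraction(slac_gradient))
```

The last printed number is the ratio of the radiated power to the rate of the energy gain; for any practicable gradient it should be negligibly small.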


10.3. Synchrotron radiation 

Now let me show that in circular accelerators, the radiation is much larger. Consider a charged
particle being accelerated in the direction perpendicular to its velocity $\mathbf{u}$ (for example by the magnetic
component of the Lorentz force), so that its speed $u$, and hence the magnitude $p$ of its momentum, do not
change. In this case, the second term in Eq. (39) vanishes, and it yields

$$\mathscr{P} = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{d\mathbf{p}}{d\tau}\right)^2 = \frac{Z_0q^2}{6\pi m^2c^2}\left(\frac{d\mathbf{p}}{dt'}\right)^2\gamma^2. \qquad (10.43)$$


Comparing this expression with Eq. (40), we see that for the same acceleration magnitude, the
electromagnetic radiation is a factor of $\gamma^2$ larger. For modern accelerators, with $\gamma \sim 10^4$-$10^5$, such a factor
creates an enormous difference. For example, if a particle is on a cyclotron orbit in a constant magnetic
field (as was analyzed in Sec. 9.6), both $\mathbf{u}$ and $\mathbf{p} = \gamma m\mathbf{u}$ obey Eq. (9.150), so that

$$\left|\frac{d\mathbf{p}}{dt'}\right| = \omega_cp = \frac{u}{R}p = \beta^2\gamma\frac{mc^2}{R} \qquad (10.44)$$

(where $R$ is orbit's radius), so that for the power of this synchrotron radiation, Eq. (43) yields

$$\mathscr{P} = \frac{Z_0q^2}{6\pi}\beta^4\gamma^4\frac{c^2}{R^2}. \qquad (10.45)$$


10 See, e.g., https://www6.slac.stanford.edu/ .

According to Eq. (9.153), at fixed magnetic field (in particle accelerators, limited to a few tesla
produced by the beam-bending magnets), the synchrotron orbit radius $R$ scales as $\gamma$, so that according to
Eq. (45), $\mathscr{P}$ scales as $\gamma^2$, i.e. grows fast with particle's energy $\mathscr{E} \propto \gamma$. For example, for typical parameters
of the first electron cyclotrons (such as the General Electric machine in which the synchrotron radiation
was first noticed in 1947), $R \sim 1$ m, $\mathscr{E} \sim 0.3$ GeV ($\gamma \sim 600$), Eq. (45) gives a very modest electron energy
loss per one revolution: $\mathscr{P}\Delta t' \approx 2\pi\mathscr{P}R/c \sim 1$ keV. However, already by the mid-1970s, electron
accelerators, with $R \sim 100$ m, have reached energies $\mathscr{E} \sim 10$ GeV, and the energy loss per revolution has
grown to ~10 MeV, becoming the major energy loss mechanism. 11 However, what is bad for particle
accelerators and storage rings is good for the so-called synchrotron light sources - the electron
accelerators designed specially for the generation of intensive synchrotron radiation - with the spectrum
extending well beyond the visible light range. Let us now analyze the angular and spectral distributions
of such radiation.
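The estimates of the energy loss per revolution may be reproduced with a few lines of code. The sketch below is an illustration, not a design tool: it assumes a singly charged particle on a circular orbit of radius R, the synchrotron power in the form Z0 q² β⁴ γ⁴ c² / 6π R² (which follows from Eqs. (43) and (44)), rounded values of the physical constants typed in by hand, and a hypothetical function name.

```python
import math

E_CHARGE = 1.602176634e-19       # elementary charge, C
C_LIGHT = 299792458.0            # speed of light, m/s
Z0 = 376.730313668               # free-space wave impedance, ohm
ELECTRON_REST_EV = 0.51099895e6  # electron rest energy, eV


def loss_per_turn_ev(energy_ev, radius_m, rest_energy_ev=ELECTRON_REST_EV):
    """Synchrotron radiation energy loss per one revolution, in eV."""
    gamma = energy_ev / rest_energy_ev
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    power = (Z0 * E_CHARGE**2 * beta**4 * gamma**4 * C_LIGHT**2
             / (6.0 * math.pi * radius_m**2))              # W
    period = 2.0 * math.pi * radius_m / (beta * C_LIGHT)   # s
    return power * period / E_CHARGE                       # J -> eV


print(loss_per_turn_ev(0.3e9, 1.0))     # early machine: R ~ 1 m, 0.3 GeV
print(loss_per_turn_ev(10e9, 100.0))    # mid-1970s scale: R ~ 100 m, 10 GeV
```

The two results should be of the order of 1 keV and 10 MeV, respectively, in line with the estimates given above.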

To calculate the angular distribution, let us select the coordinate axes as shown in Fig. 5, with
the origin at the current location of the orbiting particle, axis $z$ along its instant velocity (i.e. vector $\boldsymbol{\beta}$),
and axis $x$ toward the orbit center.

Fig. 10.5. Geometry of the synchrotron radiation problem.

In the general case, the unit vector $\mathbf{n}$ toward the radiation observer is not within any of the
coordinate planes, and hence should be described by two angles - the polar angle $\theta$ and the azimuthal
angle $\varphi$ between the $x$ axis and projection OP of vector $\mathbf{n}$ on plane $[x, y]$. Since the length of segment OP
is $\sin\theta$, the Cartesian coordinates of the relevant vectors are as follows:

$$\mathbf{n} = \{\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta\}, \quad \boldsymbol{\beta} = \{0, 0, \beta\}, \quad \dot{\boldsymbol{\beta}} = \{\dot\beta, 0, 0\}. \qquad (10.46)$$


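Synchrotron radiation's angular distribution

Plugging these coordinates into the general Eq. (30), we get

$$\frac{d\mathscr{P}}{d\Omega} = \frac{Z_0q^2}{2\pi^2}|\dot{\boldsymbol{\beta}}|^2\gamma^6f(\theta,\varphi), \quad \text{with}\ \ f(\theta,\varphi) = \frac{1}{8\gamma^6(1 - \beta\cos\theta)^3}\left[1 - \frac{\sin^2\theta\cos^2\varphi}{\gamma^2(1 - \beta\cos\theta)^2}\right]. \qquad (10.47)$$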


According to this result, just as at the linear acceleration, in the ultrarelativistic limit, most
radiation goes to a narrow cone (of width $\Delta\theta \sim \gamma^{-1} \ll 1$) around vector $\boldsymbol{\beta}$, i.e. around the instant direction
of particle's propagation. For such small angles, and $\gamma \gg 1$, the second of Eqs. (47) is reduced to

$$f(\theta,\varphi) \approx \frac{1}{(1 + \gamma^2\theta^2)^3}\left[1 - \frac{4\gamma^2\theta^2\cos^2\varphi}{(1 + \gamma^2\theta^2)^2}\right]. \qquad (10.48)$$


11 For proton accelerators, such energy loss is much less of a problem, because $\gamma$ of an ultrarelativistic particle (at
fixed $\mathscr{E}$) is proportional to $1/m$, so that the estimates, at the same $R$, should be scaled back by $(m_p/m_e)^4 \sim 10^{13}$.
Nevertheless, in the giant modern accelerators such as the LHC (with $R \approx 4.3$ km and $\mathscr{E} \approx 7$ TeV), the synchrotron
radiation loss per revolution is rather noticeable ($\mathscr{P}\Delta t' \sim 6$ keV), leading not as much to particle deceleration as to
substantial photoelectron emission from the beam tube walls, creating harmful defocusing effects.

The left panel of Fig. 6 shows the angular distribution $f(\theta, \varphi)$, color-coded, on the plane
perpendicular to particle's instant velocity (in Fig. 5, plane $[x, y]$), while its right panel shows the
intensity as a function of $\theta$ in two perpendicular directions: within the particle rotation plane (along axis
$x$) and perpendicular to this plane (along axis $y$). The result shows, first of all, that, in contrast to the
case of linear acceleration, the narrow radiation cone is now not hollow: the intensity maximum is
reached exactly at $\theta = 0$, i.e. in particle's motion direction. Second, the radiation cone is not axially
symmetric: the intensity drops faster within the particle rotation plane (and even has nodes at $\theta = \pm 1/\gamma$).
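The quality of the small-angle approximation (48), and the position of the in-plane nodes, may be checked numerically. The sketch below is an illustration with hypothetical function names; it takes the exact function f from Eq. (47) in the form with the factor 1/[8γ⁶(1 - β cos θ)³] in front of the bracket [1 - sin²θ cos²φ / γ²(1 - β cos θ)²], and compares it with Eq. (48).

```python
import math


def f_exact(theta, phi, gamma):
    """Function f of Eq. (47), without the small-angle expansion."""
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    k = 1.0 - beta * math.cos(theta)
    bracket = 1.0 - (math.sin(theta) * math.cos(phi)) ** 2 / (gamma**2 * k**2)
    return bracket / (8.0 * gamma**6 * k**3)


def f_small_angle(theta, phi, gamma):
    """Eq. (48), valid for theta << 1 and gamma >> 1."""
    x2 = (gamma * theta) ** 2
    bracket = 1.0 - 4.0 * x2 * math.cos(phi) ** 2 / (1.0 + x2) ** 2
    return bracket / (1.0 + x2) ** 3


gamma = 1000.0
for theta_gamma in (0.0, 0.5, 1.0, 2.0):
    theta = theta_gamma / gamma
    print(theta_gamma,
          f_exact(theta, 0.0, gamma), f_small_angle(theta, 0.0, gamma),
          f_exact(theta, math.pi / 2, gamma), f_small_angle(theta, math.pi / 2, gamma))
```

Within the rotation plane (φ = 0), both functions should nearly vanish at γθ = 1, while in the perpendicular direction (φ = π/2) the bracket equals 1, and the intensity falls off monotonically.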



Let us consider the time/frequency structure of the synchrotron radiation, now from the point of
view of the observer rather than the particle itself. (In the latter picture, due to the axial symmetry of the
problem, the total radiation power $\mathscr{P}$ is evidently constant.) Its semi-quantitative picture may be
obtained from the angular distribution we have just analyzed. Indeed, if an ultrarelativistic particle's
radiation is observed from a point in (or close to) the rotation plane, 12 the observer is being "struck" by
the narrow radiation cone once each rotation period, each "strike" giving a pulse of a short duration
$\Delta t \ll 1/\omega_c$ - see Fig. 7.


Fig. 10.7. (a) Synchrotron radiation cones at $\gamma \gg 1$, and (b) the in-plane component of their electric field,
observed in the rotation plane, as a function of observation time $t$ - schematically.


12 If the observation point is off-plane, or if the rotation speed is much less than $c$, the radiation is virtually
monochromatic, with frequency $\omega_c$.

The evaluation of the time duration $\Delta t$ of each pulse requires some care: its estimate $\Delta t' \sim 1/\gamma\omega_c$
is correct for the duration of the time of particle's motion while its cone is aimed at the observer.
However, due to the time compression effect, discussed in detail in Sec. 1 and described by Eqs. (12)
and (18), the pulse duration as seen by the observer is a factor of $1/(1 - \beta)$ shorter, so that

$$\Delta t = (1 - \beta)\Delta t' \sim \frac{1 - \beta}{\gamma\omega_c} \sim \frac{1}{\gamma^3\omega_c}. \qquad (10.49)$$

From the Fourier theorem, we can expect that the frequency spectrum of the radiation consists of
numerous ($N \sim \gamma^3 \gg 1$) harmonics of the rotation frequency $\omega_c$, with comparable amplitudes. However,
if the orbital frequency fluctuates even slightly ($\delta\omega_c/\omega_c > 1/N \sim 1/\gamma^3$), as it happens in most practical
systems, the radiation pulses are not coherent, so that the average radiation power spectrum may be
calculated as that of one pulse, multiplied by the number of pulses per second. In this case, the spectrum is
continuous, extending from low frequencies all the way to approximately

$$\omega_{max} \sim 1/\Delta t \sim \gamma^3\omega_c. \qquad (10.50)$$

In order to verify this estimate, let us calculate the spectrum of radiation due to a single pulse.
For that, we should first make the general notion of spectrum quantitative. Let us present an arbitrary
electric field (say that of the synchrotron radiation we are studying now), considered as a function of the
observation time $t$ (at fixed $\mathbf{r}$), as a Fourier integral: 13

$$\mathbf{E}(t) = \int_{-\infty}^{+\infty}\mathbf{E}_\omega e^{-i\omega t}d\omega. \qquad (10.51)$$


This expression may be plugged into the following formula for the total energy of the radiation pulse
(i.e. of particle energy's loss) per unit solid angle: 14

$$-\frac{d\mathscr{E}}{d\Omega} = \int_{-\infty}^{+\infty}S_n(t)R^2dt = \frac{R^2}{Z_0}\int_{-\infty}^{+\infty}|\mathbf{E}(t)|^2dt. \qquad (10.52)$$


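This substitution, plus a natural change of integration order, yield

$$-\frac{d\mathscr{E}}{d\Omega} = \frac{R^2}{Z_0}\int_{-\infty}^{+\infty}d\omega\int_{-\infty}^{+\infty}d\omega'\,\mathbf{E}_\omega\cdot\mathbf{E}_{\omega'}\int_{-\infty}^{+\infty}dt\,e^{-i(\omega + \omega')t}. \qquad (10.53)$$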


But the inner integral (over $t$) is just $2\pi\delta(\omega + \omega')$. 15 This delta-function kills one of the frequency
integrals (say, one over $\omega'$), and Eq. (53) gives a result which may be recast as

$$-\frac{d\mathscr{E}}{d\Omega} = \int_0^{+\infty}I(\omega)d\omega, \quad \text{with}\ \ I(\omega) = \frac{4\pi R^2}{Z_0}\mathbf{E}_\omega\cdot\mathbf{E}_{-\omega} = 4\pi Z_0\varepsilon_0^2(cR)^2\,\mathbf{E}_\omega\cdot\mathbf{E}_\omega^*, \qquad (10.54)$$

where the evident frequency symmetry of the scalar product $\mathbf{E}_\omega\cdot\mathbf{E}_{-\omega}$ has been utilized to fold the integral
of $I(\omega)$ to positive frequencies only. The first of Eqs. (51) and the first of Eqs. (54) make the physical
sense of function $I(\omega)$ clear: this is the so-called spectral density of the electromagnetic radiation (per
unit solid angle, per unit pulse). 16

13 In contrast to the single-frequency case (i.e. a monochromatic wave), we may avoid taking real part of the
complex function ($\mathbf{E}_\omega e^{-i\omega t}$) if we require that $\mathbf{E}_{-\omega} = \mathbf{E}_\omega^*$. However, it is important to remember the factor 1/2
required for the transition to a monochromatic wave of frequency $\omega_0$: $\mathbf{E}_\omega = \mathbf{E}_0[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]/2$.

14 Note that the expression under the integral differs from $d\mathscr{P}/d\Omega$ defined by Eq. (29) by the absence of term
$(1 - \boldsymbol{\beta}\cdot\mathbf{n}) = \partial t/\partial t'$. This is natural, because this is the wave energy arriving at the observation point $\mathbf{r}$ during time
interval $dt$ rather than $dt'$.

15 See, e.g. MA Eq. (14.3a).

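In order to calculate the spectral density, we need to express function $\mathbf{E}_\omega$ via $\mathbf{E}(t)$ using the
Fourier transform reciprocal to Eq. (51):

$$\mathbf{E}_\omega = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\mathbf{E}(t)e^{i\omega t}dt. \qquad (10.55)$$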


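In the particular case of radiation by a single point charge, we should use the second term of Eq. (20a):

$$\mathbf{E}_\omega = \frac{1}{2\pi}\frac{q}{4\pi\varepsilon_0cR}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1 - \boldsymbol{\beta}\cdot\mathbf{n})^3}\right]_re^{i\omega t}dt. \qquad (10.56)$$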


Since vectors $\mathbf{n}$ and $\boldsymbol{\beta}$ are natural functions of the radiation (retarded) time $t'$, let us use Eq. (18) to
change integration in Eq. (56) from the observation time $t$ to time $t'$:

$$\mathbf{E}_\omega = \frac{q}{4\pi\varepsilon_0}\frac{1}{2\pi cR}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1 - \boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_r\exp\left\{i\omega\left(t' + \frac{R_r}{c}\right)\right\}dt'. \qquad (10.57)$$


The strong inequality $R_r \gg r'$ that is implied from the beginning of this section allows us to consider the
unit vector $\mathbf{n}$ as constant and, moreover, to use approximation (8.19) to reduce Eq. (57) to

$$\mathbf{E}_\omega = \frac{1}{2\pi}\frac{q}{4\pi\varepsilon_0cR}\exp\left\{\frac{i\omega r}{c}\right\}\int_{-\infty}^{+\infty}\left[\frac{\mathbf{n}\times\{(\mathbf{n} - \boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1 - \boldsymbol{\beta}\cdot\mathbf{n})^2}\right]_r\exp\left\{i\omega\left(t' - \frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}dt'. \qquad (10.58)$$


Plugging this expression into Eq. (54), we get 17 


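$$I(\omega) = \frac{Z_0 q^2}{16\pi^3}\left|\int_{-\infty}^{+\infty}\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times\dot{\boldsymbol{\beta}}\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2}\,\exp\left\{i\omega\left(t'-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}dt'\right|^2. \tag{10.59}$$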


Let me remind the reader that $\boldsymbol{\beta}$ inside this integral is supposed to be taken at the retarded point $\{\mathbf{r}', t'\}$, so that Eq. (59) is fully sufficient for finding the spectral density from the law $\mathbf{r}'(t')$ of particle’s motion. However, this result may be further simplified by noticing that the fraction before the exponent may be presented as a full derivative over $t'$,


16 The notion of spectral density may be readily generalized to random processes - see, e.g., SM Sec. 5.4. 

17 Note that for our current purposes of calculation of spectral density of radiation by a single particle, factor $\exp\{i\omega r/c\}$ has got cancelled. However, as we have seen in Chapter 8, this factor plays the central role at interference of radiation from several (many) sources. In the context of synchrotron radiation, such interference becomes important in undulators and free-electron lasers - the devices to be (qualitatively) discussed below.

$$\frac{\mathbf{n}\times\{(\mathbf{n}-\boldsymbol{\beta})\times d\boldsymbol{\beta}/dt'\}}{(1-\boldsymbol{\beta}\cdot\mathbf{n})^2} = \frac{d}{dt'}\left[\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})}{1-\boldsymbol{\beta}\cdot\mathbf{n}}\right], \tag{10.60}$$


and working out the resulting integral by parts. At this operation, the time differentiation of the parentheses in the exponent, $d(t' - \mathbf{n}\cdot\mathbf{r}'/c)/dt' = 1 - \mathbf{n}\cdot\mathbf{u}/c = 1 - \boldsymbol{\beta}\cdot\mathbf{n}$, leads to the cancellation of denominator’s remains and hence to a surprisingly simple result: 18

$$I(\omega) = \frac{Z_0 q^2\omega^2}{16\pi^3}\left|\int_{-\infty}^{+\infty}\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})\,\exp\left\{i\omega\left(t'-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}dt'\right|^2. \tag{10.61}$$
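The full-derivative identity used in this step is easy to check numerically. The following is only a minimal sketch, not part of the original derivation: it assumes uniform circular motion with illustrative values of β and of the observation angle (the constants `BETA`, `OMEGA_C` and `theta0` are arbitrary choices), and compares the fraction standing before the exponent with a central finite-difference derivative of n×(n×β)/(1 − β·n).

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x*y for x, y in zip(a, b))

BETA, OMEGA_C = 0.9, 1.0   # illustrative values only (assumptions)

def beta_vec(t):
    """Normalized velocity for uniform rotation in the [x, z] plane."""
    a = OMEGA_C * t
    return (BETA*math.sin(a), 0.0, BETA*math.cos(a))

def beta_dot(t):
    """Time derivative of beta_vec."""
    a = OMEGA_C * t
    return (BETA*OMEGA_C*math.cos(a), 0.0, -BETA*OMEGA_C*math.sin(a))

def fraction(t, n):
    """n x {(n - beta) x dbeta/dt'} / (1 - beta.n)^2."""
    b, bd = beta_vec(t), beta_dot(t)
    diff = tuple(ni - bi for ni, bi in zip(n, b))
    num = cross(n, cross(diff, bd))
    den = (1.0 - dot(b, n))**2
    return tuple(x/den for x in num)

def under_derivative(t, n):
    """n x (n x beta) / (1 - beta.n)."""
    b = beta_vec(t)
    num = cross(n, cross(n, b))
    den = 1.0 - dot(b, n)
    return tuple(x/den for x in num)

def derivative(t, n, h=1e-5):
    """Central finite-difference derivative of under_derivative over t'."""
    plus, minus = under_derivative(t + h, n), under_derivative(t - h, n)
    return tuple((p - m)/(2*h) for p, m in zip(plus, minus))

theta0 = 0.4
n = (0.0, math.sin(theta0), math.cos(theta0))
max_mismatch = max(
    abs(l - r)
    for t in (-0.7, -0.2, 0.0, 0.3, 1.1)
    for l, r in zip(fraction(t, n), derivative(t, n))
)
print(f"largest mismatch between the two sides: {max_mismatch:.2e}")
```

The mismatch should be limited only by the finite-difference step, which is what one expects if the identity holds for any smooth β(t′), not just for this sample trajectory.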


Returning to the particular case of synchrotron radiation, it is beneficial to choose the origin of time $t'$ so that at $t' = 0$, angle $\theta$ takes its smallest value $\theta_0$, i.e., in terms of Fig. 5, vector $\mathbf{n}$ is within plane [y, z]. Fixing this direction of axes in time, we can redraw that figure as shown in Fig. 7. In these coordinates,

$$\mathbf{n} = \{0,\ \sin\theta_0,\ \cos\theta_0\}, \quad \mathbf{r}' = \{R(1-\cos\alpha),\ 0,\ R\sin\alpha\}, \quad \boldsymbol{\beta} = \{\beta\sin\alpha,\ 0,\ \beta\cos\alpha\}, \tag{10.62}$$

where $\alpha \equiv \omega_c t'$, and an easy multiplication yields

$$\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}) = \beta\,\{\sin\alpha,\ \sin\theta_0\cos\theta_0\cos\alpha,\ -\sin^2\theta_0\cos\alpha\}, \tag{10.63}$$

$$\exp\left\{i\omega\left(t'-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\} = \exp\left\{i\omega\left(t'-\frac{R}{c}\cos\theta_0\sin\alpha\right)\right\}. \tag{10.64}$$



Fig. 10.7. Deriving the spectral density of synchrotron radiation. Vector $\mathbf{n}$ is fixed in plane [y, z], while vectors $\mathbf{r}'(t')$ and $\boldsymbol{\beta}(t')$ rotate in plane [x, z] with angular velocity $\omega_c$.


As we already know, in the (most interesting) ultrarelativistic limit $\gamma \gg 1$, most radiation is confined to short pulses, so that only small angles $\alpha \sim \omega_c\Delta t' \sim \gamma^{-1}$ may contribute to the integral in Eq. (61). Moreover, since most radiation goes to small angles $\theta \sim \gamma^{-1}$, it makes sense to consider only small angles $\theta_0 \sim \gamma^{-1} \ll 1$. Expanding both trigonometric functions of these small angles, participating in parentheses of Eq. (64), into Taylor series, and keeping only terms up to $O(\gamma^{-3})$, we can present them as

$$\left(t'-\frac{R}{c}\cos\theta_0\sin\alpha\right) \approx \left(t' - \frac{R}{c}\,\omega_c t' + \frac{R}{c}\,\frac{\theta_0^2}{2}\,\omega_c t' + \frac{R}{c}\,\frac{\omega_c^3 t'^3}{6}\right). \tag{10.65}$$
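To get a feeling for how accurate this small-angle expansion is, here is a minimal numerical sketch. It is an illustration only: the values γ = 100 and θ₀ = 1/γ are arbitrary assumptions, time is measured in units of 1/ω_c, and the expansion is taken in its final ultrarelativistic form, (t′/2)(θ₀² + γ⁻²) + ω_c²t′³/6, i.e. with (R/c)ω_c = β replaced by 1 everywhere except in the difference 1 − β ≈ 1/2γ².

```python
import math

def exact_phase_time(t, gamma, theta0, omega_c=1.0):
    """t' - (R/c) cos(theta0) sin(omega_c t'), using (R/c) omega_c = beta."""
    beta = math.sqrt(1.0 - 1.0/gamma**2)
    return t - (beta/omega_c) * math.cos(theta0) * math.sin(omega_c*t)

def approx_phase_time(t, gamma, theta0, omega_c=1.0):
    """Small-angle form: (t'/2)(theta0^2 + gamma^-2) + omega_c^2 t'^3 / 6."""
    return 0.5*(theta0**2 + gamma**-2)*t + omega_c**2 * t**3 / 6.0

gamma = 100.0            # assumed Lorentz factor (illustrative)
theta0 = 1.0/gamma       # typical observation angle ~ 1/gamma

rel_errors = []
for k in (0.5, 1.0, 2.0, 3.0):
    t = k/gamma          # times within the radiation pulse, alpha ~ 1/gamma
    exact = exact_phase_time(t, gamma, theta0)
    approx = approx_phase_time(t, gamma, theta0)
    rel_errors.append(abs(exact - approx)/abs(exact))

print("largest relative error:", max(rel_errors))
```

For these parameters the relative error stays far below one percent, since the neglected terms are of higher order in 1/γ.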


18 Actually, this simplification is not accidental. According to Eq. (10b), the expression under the derivative is just the transverse component of the vector-potential $\mathbf{A}$ (give or take a constant factor), and from the discussion in Sec. 8.2 we know that this component determines the electric dipole radiation of the particle (which dominates the radiation field in our current case of uncompensated electric charge).

Since $(R/c)\omega_c = u/c = \beta \approx 1$, in two last terms we may approximate this parameter by 1. However, it is crucial to distinguish the difference of two first terms, proportional to $(1 - \beta)t'$, from zero, and as we have done before we may approximate it with $t'/2\gamma^2$. In Eq. (63), which does not have such critical differences, we may be more bold, taking 19

$$\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}) \approx \{\alpha,\ \theta_0,\ 0\} = \{\omega_c t',\ \theta_0,\ 0\}. \tag{10.66}$$


As a result, Eq. (61) is reduced to

$$I(\omega) = \frac{Z_0 q^2}{16\pi^3}\left|a_x\mathbf{n}_x + a_y\mathbf{n}_y\right|^2, \tag{10.67}$$

where $a_x$ and $a_y$ are the dimensionless factors,

$$a_x \equiv \omega\int_{-\infty}^{+\infty}\omega_c t'\,\exp\left\{\frac{i\omega}{2}\left[(\theta_0^2+\gamma^{-2})\,t' + \frac{\omega_c^2 t'^3}{3}\right]\right\}dt', \quad a_y \equiv \omega\int_{-\infty}^{+\infty}\theta_0\,\exp\left\{\frac{i\omega}{2}\left[(\theta_0^2+\gamma^{-2})\,t' + \frac{\omega_c^2 t'^3}{3}\right]\right\}dt', \tag{10.68}$$

which describe the frequency spectra of two components of the synchrotron radiation, with mutually perpendicular directions of polarization. Defining a dimensionless parameter

$$\nu \equiv \frac{\omega}{3\omega_c}\left(\theta_0^2+\gamma^{-2}\right)^{3/2}, \tag{10.69}$$

proportional to the observation frequency, and changing the integration variable to $\xi \equiv \omega_c t'/(\theta_0^2+\gamma^{-2})^{1/2}$, integrals (68) may be reduced to the modified Bessel functions of the second kind:

$$a_x = \frac{2\sqrt{3}\,i}{3}\,\frac{\omega}{\omega_c}\left(\theta_0^2+\gamma^{-2}\right)K_{2/3}(\nu), \quad a_y = \frac{2\sqrt{3}}{3}\,\frac{\omega}{\omega_c}\,\theta_0\left(\theta_0^2+\gamma^{-2}\right)^{1/2}K_{1/3}(\nu). \tag{10.70}$$

Figure 8a shows the dependence of amplitudes $a_x$ and $a_y$ on the normalized observation frequency $\nu$. It is clear that the in-plane component, proportional to $a_x$, is larger. (The off-plane component disappears altogether at $\theta_0 = 0$, i.e. at observation within the particle rotation plane [x, z], due to the evident mirror symmetry of the problem about the plane.) It is also clear that the spectrum changes rather slowly (note the log-log scale of the plot!) until the normalized frequency, defined by Eq. (69), reaches ~1. For most important observation angles $\theta_0 \sim \gamma^{-1}$ this means that our estimate (50) is indeed correct, though theoretically the frequency spectrum extends to infinity. 20
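The frequency behavior described above may be explored with a few lines of code. This is a sketch under stated assumptions rather than a reproduction of the plotted curves: it assumes the standard reduction of such Airy-type integrals to the modified Bessel functions K₂/₃ and K₁/₃, so that at a fixed observation angle the two amplitudes scale as νK₂/₃(ν) and νK₁/₃(ν). The helper `bessel_k` is a hypothetical stand-in for a library routine, based on the integral representation K_ν(x) = ∫₀^∞ exp(−x cosh t) cosh(νt) dt evaluated by the trapezoid rule.

```python
import math

def bessel_k(order, x, t_max=20.0, steps=20000):
    """Modified Bessel function K_order(x) for x > 0, via the integral
    int_0^inf exp(-x cosh t) cosh(order t) dt (trapezoid rule)."""
    h = t_max/steps
    total = 0.5*math.exp(-x)              # t = 0 endpoint, cosh(0) = 1
    for i in range(1, steps + 1):
        t = i*h
        total += math.exp(-x*math.cosh(t)) * math.cosh(order*t)
    return total*h

def in_plane_shape(nu):
    """Frequency dependence of the in-plane amplitude at fixed angle."""
    return nu*bessel_k(2.0/3.0, nu)

def off_plane_shape(nu):
    """Frequency dependence of the off-plane amplitude at fixed angle."""
    return nu*bessel_k(1.0/3.0, nu)

# Consistency check of the integrator against the closed form of K_1/2.
k_half_numeric = bessel_k(0.5, 1.0)
k_half_exact = math.sqrt(math.pi/2.0)*math.exp(-1.0)

for nu in (0.01, 0.1, 1.0, 3.0, 10.0):
    print(f"nu = {nu:5.2f}   in-plane ~ {in_plane_shape(nu):.3e}   "
          f"off-plane ~ {off_plane_shape(nu):.3e}")
```

The printed values change slowly (roughly as a weak power of ν) below ν ~ 1 and drop exponentially above it, with the in-plane shape staying above the off-plane one.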


19 By the way, this expression shows that the in-plane (x) component of the electric field is an odd function of $t'$ (and hence $t$ - see its sketch in Fig. 7), while the perpendicular component is an even function of time. Also notice that for an observer exactly in the rotation plane ($\theta_0 = 0$) the latter component vanishes.

20 The law of the spectral density decrease at large $\nu$ may be readily obtained from the second of Eqs. (2.158) which is valid for any (even non-integer) Bessel function index $n$: $a_x \propto a_y \propto \nu^{-1/2}\exp\{-\nu\}$. Here the exponential factor is certainly most important.
Fig. 10.8. Frequency spectra of synchrotron radiation: (a) the amplitudes $a_x$ and $a_y$, and (b) the total (polarization- and angle-averaged) radiation.


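Naturally, a similar frequency behavior is valid for the spectral density integrated over the full solid angle. Without performing the integration, 21 let me give the result (also valid for $\gamma \gg 1$ only) for reader’s reference:

$$\oint_{4\pi} I(\omega)\,d\Omega = \frac{\sqrt{3}}{4\pi}\,Z_0 q^2\gamma\,\zeta\int_\zeta^\infty K_{5/3}(\xi)\,d\xi, \quad \zeta \equiv \frac{2}{3}\,\frac{\omega}{\omega_c\gamma^3}. \tag{10.71}$$

Figure 8b shows the dependence of this integral on the normalized frequency $\zeta$. (This plot is sometimes called the “universal flux curve”.) In accordance with estimate (50), it reaches maximum at

$$\zeta_{\max} \approx 0.3, \quad \text{i.e. } \omega_{\max} \approx \frac{\omega_c}{2}\,\gamma^3. \tag{10.72}$$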


For the new National Synchrotron Light Source (NSLS-II), that is under construction in the Brookhaven National Laboratory very close to our campus, with the ring circumference of 792 m, the electron revolution period $T$ will be 2.64 μs. Calculating $\omega_c$ as $2\pi/T \approx 2.4\times10^6\ \text{s}^{-1}$, for the planned $\gamma \approx 6\times10^3$ ($\mathscr{E} \approx 3$ GeV), 22 we get $\omega_{\max} \sim 3\times10^{17}\ \text{s}^{-1}$, corresponding to photon energy $\hbar\omega_{\max} \sim 200$ eV, i.e. to soft X-rays. In the light of this estimate, the reader may be surprised by Fig. 9 that shows the projected spectra of radiation which this facility is designed to produce, with maximum photon energies up to a few keV.
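The arithmetic of this estimate is easy to reproduce. The sketch below is an illustration only: it assumes that the electrons move essentially at the speed of light, and that the spectrum peaks at ω_max ≈ ω_c γ³/2, which is how the position of the maximum of the universal flux curve is read here; the function names are hypothetical helpers, not anything from the source.

```python
import math

C = 299_792_458.0            # speed of light, m/s
HBAR = 1.054_571_817e-34     # reduced Planck constant, J*s
EV = 1.602_176_634e-19       # joules per electron-volt

def cyclotron_frequency(circumference_m):
    """omega_c = 2*pi/T for an ultrarelativistic particle (u ~ c assumed)."""
    period = circumference_m / C
    return 2.0*math.pi/period

def peak_frequency(omega_c, gamma):
    """Assumed peak position of the angle-integrated spectrum."""
    return 0.5*omega_c*gamma**3

omega_c = cyclotron_frequency(792.0)        # ring circumference from the text
omega_max = peak_frequency(omega_c, 6.0e3)  # gamma ~ 6e3 from the text
photon_energy_ev = HBAR*omega_max/EV

print(f"omega_c   ~ {omega_c:.2e} 1/s")
print(f"omega_max ~ {omega_max:.2e} 1/s")
print(f"photon energy ~ {photon_energy_ev:.0f} eV")
```

The result lands in the soft X-ray range of a couple of hundred eV, in line with the order-of-magnitude figure quoted above.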

The reason for this discrepancy is that in NSLS-II, and in all modern synchrotron light sources, most radiation is produced not by the circular orbit itself, but rather using special devices inserted into the electron beam path. These devices include bend magnets with magnetic field stronger than the average field on the orbit (which, according to Eq. (9.112), produce higher effective value of $\omega_c$ and


21 For that, and many other details, the interested reader may be referred, for example, to the fundamental review collection by E. E. Koch et al. (eds.) Handbook on Synchrotron Radiation (in 5 vols.), North-Holland, 1983-1991, or a more concise monograph by A. Hofmann, The Physics of Synchrotron Radiation, Cambridge U. Press, 2007.

22 By modern standards, this energy is not too high. The distinguished feature of NSLS-II is its unprecedented electron beam intensity (planned average beam current up to 500 mA) which should allow an extremely high synchrotron “brightness” $I(\omega)$.

hence of $\omega_{\max}$), and wigglers and undulators: strings of several strong magnets with alternating field direction (Fig. 10), that induce periodic bending of electron trajectory, with radiation emitted at each bend.



Fig. 10.9. Design brightness of various synchrotron radiation sources of the NSLS-II facility, as a function of photon energy. For bend magnets and wigglers, the “brightness” may be obtained by multiplication of the spectral density $I(\omega)$ from one electron pulse, calculated above, by the number of electrons passing the source per second. (Note the non-SI units, commonly used in the synchrotron radiation community.) However, for undulators, there is an additional factor due to the partial coherence of radiation - see below. (Data from document NSLS-II Source Properties and Floor Layout, available online at http://www.nsls.bnl.gov/ .)



Fig. 10.10. The generic magnetic structure (an array of permanent magnets) common for wigglers, undulators and free-electron lasers. (Adapted from http://www-xfel.spring8.or.jp/cband/e/Undulator.htm .)
The difference between wigglers and undulators is more quantitative than qualitative: the former devices have a larger spatial period $\lambda$ (distance between the adjacent magnets of the same polarity, see Fig. 10), giving enough space for the electron beam to bend by an angle larger than $\gamma^{-1}$, i.e. larger than the radiation cone angle. As a result, the pulses radiated at each period arrive at an in-plane observer as a series of individual pulses (Fig. 11a). The shape of each pulse, and hence its frequency spectrum, are similar to those discussed above, 23 but with much higher local values of $\omega_c$ and $\omega_{\max}$ - see Fig. 9. Another difference is a much higher frequency of the peaks. Indeed, the fundamental Eq. (18) allows us to calculate the time distance between them, for the observer, as

$$\Delta t \approx \frac{\partial t}{\partial t'}\,\Delta t' \approx (1-\beta)\,\frac{\lambda}{u} \approx \frac{1}{2\gamma^2}\,\frac{\lambda}{c} \ll \frac{\lambda}{c}, \tag{10.73}$$


where the first two relations are valid at $\lambda \ll R$ (the relation typically satisfied very well, see Fig. 9), and the last two relations also require the ultrarelativistic limit. As a result, the radiation intensity, that is proportional to the number of poles, is much higher than that from the bend magnets - in the NSLS-II case, by more than 2 orders of magnitude, clearly visible in Fig. 9.



Fig. 10.11. Radiation (with in-plane polarization) from (a) a wiggler and (b) an undulator - schematically.


The situation is different in undulators - similar structures with smaller spatial period $\lambda$, in which electron’s velocity vector oscillates with angular amplitude smaller than $\gamma^{-1}$. As a result, the radiation pulses overlap (Fig. 11b) and the radiation waveform is closer to a sinusoidal one. Hence, the radiation spectrum narrows to the central frequency 24

$$\omega_0 = \frac{2\pi}{\Delta t} \approx 2\gamma^2\,\frac{2\pi c}{\lambda}. \tag{10.74}$$

For example, for the NSLS-II undulators with $\lambda = 20$ mm, this formula predicts the radiation peak at photon energy $\hbar\omega_0 \approx 4$ keV, in a reasonable agreement with results of quantitative calculations, shown


23 A small problem for the reader: use Eqs. (20) and (63) to explain the difference between shapes of pulses 
generated at opposite magnetic poles of the wiggler, that is schematically shown in Fig. 11a. 

24 This important formula may be also interpreted in the following way. Due to the relativistic length contraction (9.20), the undulator structure period as perceived by beam electrons is $\lambda' = \lambda/\gamma$, so that the central frequency of radiation is $\omega_0' = 2\pi c/\lambda' = 2\pi c\gamma/\lambda$. For the lab-frame observer, this frequency is Doppler-upshifted according to Eq. (9.44): $\omega_0 = \omega_0'[(1+\beta)/(1-\beta)]^{1/2} \approx 2\gamma\omega_0'$, giving the same result as Eq. (74).
in Fig. 9. 25 Due to the spectrum narrowing, the intensity of undulator radiation is higher than that of wigglers using the same electron beam.
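The undulator estimate may be checked in the same spirit. The following is a minimal sketch, assuming the ultrarelativistic relations Δt ≈ λ/2γ²c and ω₀ = 2π/Δt, with the period λ = 20 mm and γ ≈ 6×10³ quoted in the text; the function names are hypothetical. It also evaluates the alternative route via the length contraction and the Doppler shift, using the exact identity [(1+β)/(1−β)]^(1/2) = γ(1+β) to avoid the numerical cancellation in 1 − β.

```python
import math

C = 299_792_458.0            # speed of light, m/s
HBAR = 1.054_571_817e-34     # reduced Planck constant, J*s
EV = 1.602_176_634e-19       # joules per electron-volt

def pulse_spacing(period_m, gamma):
    """Observed time between pulses from adjacent periods, gamma >> 1."""
    return period_m/(2.0*gamma**2*C)

def central_frequency(period_m, gamma):
    """omega_0 = 2*pi/dt for overlapping (undulator) pulses."""
    return 2.0*math.pi/pulse_spacing(period_m, gamma)

def central_frequency_doppler(period_m, gamma):
    """Same quantity via length contraction plus Doppler upshift."""
    beta = math.sqrt(1.0 - 1.0/gamma**2)
    omega_rest = 2.0*math.pi*C*gamma/period_m   # frequency in the beam frame
    return omega_rest*gamma*(1.0 + beta)        # sqrt((1+b)/(1-b)) = g(1+b)

gamma, period = 6.0e3, 20e-3
omega_0 = central_frequency(period, gamma)
omega_0_doppler = central_frequency_doppler(period, gamma)
photon_energy_ev = HBAR*omega_0/EV

print(f"pulse spacing ~ {pulse_spacing(period, gamma):.2e} s")
print(f"photon energy ~ {photon_energy_ev/1e3:.1f} keV")
```

Both routes give the same central frequency up to corrections of the order of 1/γ², and a photon energy of a few keV.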

This spectrum-narrowing trend is brought to its logical conclusion in the so-called free-electron lasers 26 whose basic structure is the same as that of wigglers and undulators (Fig. 10), but the radiation at each beam bend is so intense and narrow-focused that it affects the electron motion downstream the radiation cone. As a result, the radiation of all bends becomes synchronized, so its spectrum is a narrow line at frequency (74), with electromagnetic wave amplitude proportional to the number $N$ of electrons in the structure, and hence its power proportional to $N^2$ (rather than to $N$ as in wigglers and undulators).

Finally, note that wigglers, undulators, and free-electron lasers may be also used at the end of a linear electron accelerator (such as SLAC) that, as was noted above, may provide extremely high values of $\gamma$, and hence radiation frequencies (74), due to the absence of the radiation energy losses at the electron acceleration stage.


10.4. Bremsstrahlung and Coulomb losses

Surprisingly, a very similar mechanism of radiation by charged particles works at a much smaller spatial scale, namely at their scattering by charged particles of the propagation medium, the so-called bremsstrahlung - German for “brake radiation”. This effect is responsible, in particular, for the continuous part of the frequency spectrum of the radiation produced by standard vacuum X-ray tubes, at the electron incidence on a solid “anticathode”. 27

The bremsstrahlung in condensed matter is generally a rather complicated phenomenon, because of the simultaneous involvement of many particles, and of some quantum electrodynamic effects. This is why I will give only a very brief glimpse at the theoretical description of this effect, for the simplest case when scattering of incoming, relatively light charged particles (such as electrons, protons, α-particles, etc.) is produced by atomic nuclei that remain virtually immobile during the scattering event (Fig. 12). This is a reasonable approximation if the energy of incoming particles is not too low, otherwise most scattering is produced by atomic electrons whose dynamics is substantially quantum - see below.



Fig. 10.12. Basic geometry 
of the bremsstrahlung and 
Coulomb loss problems in 
(a) direct and (b) reciprocal 
space. 


25 Much of the difference is due to the fact that those plots show the spectral density of the number of photons $n = \mathscr{E}/\hbar\omega$ per second, which peaks above the density of power, i.e. energy per second.

26 This name is somewhat misleading, because in contrast to the usual (“quantum”) lasers, the free-electron laser operation is essentially classical and very similar to that of vacuum-tube microwave generators (such as magnetrons briefly discussed in Sec. 9.6) - see, e.g., E. Saldin et al., The Physics of Free Electron Lasers, Springer, 2000.

27 Such X-ray radiation had been observed experimentally, though not correctly interpreted, by N. Tesla in 1887, i.e. before the radiation was studied in detail (and much publicized) by W. Röntgen.

To calculate the frequency spectrum of radiation emitted during a single scattering event, it is convenient to use a byproduct of the last section’s analysis, namely Eq. (59) with replacement (60): 28

$$I(\omega) = \frac{1}{4\pi^2 c}\,\frac{q^2}{4\pi\varepsilon_0}\left|\int_{-\infty}^{+\infty}\frac{d}{dt'}\left[\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta})}{1-\boldsymbol{\beta}\cdot\mathbf{n}}\right]\exp\left\{i\omega\left(t'-\frac{\mathbf{n}\cdot\mathbf{r}'}{c}\right)\right\}dt'\right|^2. \tag{10.75}$$


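The typical duration $\tau$ of a single scattering event, that is described by this formula, is of the order of $a_0/c \sim (10^{-10}\ \text{m})/(3\times10^8\ \text{m/s}) \sim 10^{-18}$ s in solids, and only an order of magnitude longer in gases at ambient conditions. This is why for most frequencies of interest, from zero all the way up to at least soft X-rays, 29 we can use the so-called low-frequency approximation, taking the exponent in Eq. (75) for 1 through the whole collision event, i.e. the integration interval. This approximation immediately yields

$$I(\omega) = \frac{1}{4\pi^2 c}\,\frac{q^2}{4\pi\varepsilon_0}\left|\frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}_{\rm fin})}{1-\boldsymbol{\beta}_{\rm fin}\cdot\mathbf{n}} - \frac{\mathbf{n}\times(\mathbf{n}\times\boldsymbol{\beta}_{\rm ini})}{1-\boldsymbol{\beta}_{\rm ini}\cdot\mathbf{n}}\right|^2. \tag{10.76}$$

In the nonrelativistic limit ($\beta_{\rm ini}, \beta_{\rm fin} \ll 1$), this formula is reduced to 30

$$I(\omega) = \frac{1}{4\pi^2 c}\,\frac{q^2}{4\pi\varepsilon_0}\,\frac{\mathscr{q}^2}{m^2c^2}\,\sin^2\theta, \tag{10.77}$$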


where $\mathscr{q}$ is the momentum transferred from the scattering center to the scattered charge (Fig. 12): 31

$$\mathscr{q} \equiv \mathbf{p}_{\rm fin} - \mathbf{p}_{\rm ini} = m\Delta\mathbf{u} = mc\Delta\boldsymbol{\beta} = mc\,(\boldsymbol{\beta}_{\rm fin} - \boldsymbol{\beta}_{\rm ini}), \tag{10.78}$$

and $\theta$ is the angle between vector $\mathscr{q}$ and the direction $\mathbf{n}$ toward the observer.

The most important feature of result (77) is the frequency-independent (“white”) spectrum of the radiation, very typical for any rapid leaps, which may be approximated as theta-functions of time. (Note, however, that this is only valid for a fixed value of $\mathscr{q}$, so that the statistics of this parameter, to be discussed in a minute, “colors” the radiation.) Note also the angular distribution of the radiation, forming the usual “doughnut” shape about the momentum transfer vector $\mathscr{q}$. In particular, this means that in typical cases when $\mathscr{q} \sim p$, the bremsstrahlung produces a significant radiation flow in the direction back to the particle source - the fact significant for the operation of X-ray tubes.

Now integrating over all wave propagation angles, just as we did for the instant radiation power 
in Sec. 8.2, we get the spectral density of the full energy loss, 


28 In publications on this topic (whose development peak was in the 1920s and 1930s), Gaussian units are more common, and letter $Z$ is usually reserved for expressing charges as multiples of the fundamental charge $e$, rather than for the wave impedance. This is why, in order to avoid confusion, in this section I will use $1/\varepsilon_0 c \equiv Z_0$ for the free-space wave impedance and, still sticking to the same SI units as used through my lecture notes, will write the coefficients in a form that makes the transfer to the Gaussian units trivial: it is sufficient to replace all $(qq'/4\pi\varepsilon_0)_{\rm SI}$ with $(qq')_{\rm Gaussian}$. In the (rare) cases when I spell out the charge values, I will use a different font: $q = \mathscr{z}e$, $q' = \mathscr{z}'e$.

29 A more careful analysis shows that this approximation is actually quite reasonable up to much higher frequencies of the order of $\mathscr{E}/\hbar$.

30 Evidently, this result (but not the general Eq. (76)!) may be derived from Eq. (8.27) as well.

31 Please note the font-marked difference between this variable ($\mathscr{q}$) and particle’s electric charge ($q$).

$$-\frac{d\mathscr{E}}{d\omega} = \oint_{4\pi} I(\omega)\,d\Omega = \frac{2}{3\pi c}\,\frac{q^2}{4\pi\varepsilon_0}\,\frac{\mathscr{q}^2}{m^2c^2}. \tag{10.79}$$

The main new feature of bremsstrahlung (as of most scattering problems 32 ), is the necessity to take into account the randomness of the impact parameter $b$ (Fig. 12). For elastic ($\beta_{\rm ini} = \beta_{\rm fin} \equiv \beta$) Coulomb collisions we can use the so-called Rutherford formula for the differential cross-section of scattering 33

$$\frac{d\sigma}{d\Omega'} = \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\left(\frac{1}{2p\beta c}\right)^2\frac{1}{\sin^4(\theta'/2)}. \tag{10.80}$$


Here $d\sigma = 2\pi b\,db$ is the elementary area of the sample cross-section (as visible from the direction of incident particles) corresponding to particle scattering into an elementary solid angle 34

$$d\Omega' = 2\pi\sin\theta'\,|d\theta'|. \tag{10.81}$$

Differentiating the geometric relation that is evident from Fig. 12,

$$\mathscr{q} = 2p\sin\frac{\theta'}{2}, \tag{10.82}$$

we may present Eq. (80) in a more convenient form

$$\frac{d\sigma}{d\mathscr{q}} = 8\pi\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{u^2\mathscr{q}^3}. \tag{10.83}$$

Now combining Eqs. (79) and (83), we get

$$-\frac{d\mathscr{E}}{d\omega}\,\frac{d\sigma}{d\mathscr{q}} = \frac{16}{3}\,\frac{q^2}{4\pi\varepsilon_0}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{m^2c^3u^2}\,\frac{1}{\mathscr{q}}. \tag{10.84}$$

This product is called the differential radiation cross-section. When averaged over all values of $\mathscr{q}$ (which is equivalent to averaging over all values of the impact parameter), it gives a convenient measure of radiation intensity. Indeed, after the multiplication by the volume density $n$ of independent scattering centers, the integral gives particle’s energy loss per unit bandwidth of radiation per unit path length, $-d^2\mathscr{E}/d\omega\,dx$. A technical problem here is that the integral of $1/\mathscr{q}$ formally diverges at both infinite and zero values of $\mathscr{q}$. However, these divergences are very weak (logarithmic), and the integral converges due to virtually any reason unaccounted for by our simple analysis. The standard simple way to account for these effects is to write

$$-\frac{d^2\mathscr{E}}{d\omega\,dx} = \frac{16}{3}\,n\,\frac{q^2}{4\pi\varepsilon_0}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{m^2c^3u^2}\,\ln\frac{\mathscr{q}_{\max}}{\mathscr{q}_{\min}}, \tag{10.85}$$


32 See, e.g., CM Sec. 3.7. 

33 See, e.g., CM Eq. (3.72) with constant $\alpha = qq'/4\pi\varepsilon_0$. In the form used in Eq. (80), the Rutherford formula is also valid for small-angle scattering of relativistic particles, the criterion being $|\Delta\theta| \ll 2/\gamma$.

34 Angle $\theta'$ and differential $d\Omega'$, describing the direction of scattered particles, should not be confused with $\theta$ and $d\Omega$ describing directions of the radiation emitted at the scattering event.

and then plug, instead of $\mathscr{q}_{\max}$ and $\mathscr{q}_{\min}$, scales of the most important effects limiting the momentum transfer range. At classical analysis, according to Eq. (82), $\mathscr{q}_{\max} = 2p$. To estimate $\mathscr{q}_{\min}$, let us note that very small momentum transfer takes place when the impact parameter $b$ is very large and hence the effective scattering time $\tau \sim b/u$ is very long. Recalling the condition of the low-frequency approximation, we may associate $\mathscr{q}_{\min}$ with $\tau \sim 1/\omega$ and hence with $b \sim u\tau \sim u/\omega$. For small scattering angles, $\mathscr{q}$ may be estimated as the impulse $F\tau \sim (qq'/4\pi\varepsilon_0 b^2)\tau$ of the Coulomb force, so that $\mathscr{q}_{\min} \sim (qq'/4\pi\varepsilon_0)\,\omega/u^2$, and Eq. (85) becomes

$$-\frac{d^2\mathscr{E}}{d\omega\,dx} = \frac{16}{3}\,n\,\frac{q^2}{4\pi\varepsilon_0}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{m^2c^3u^2}\,\ln\frac{2mu^3}{(qq'/4\pi\varepsilon_0)\,\omega}. \tag{10.86}$$

Classical bremsstrahlung

This is Bohr’s formula for what is called the classical bremsstrahlung. We see that the low momentum cutoff indeed makes the spectrum colored, with more energy going to lower frequencies. There is even a formal divergence at $\omega \to 0$; however, this divergence is integrable, so it does not present a problem in finding the total radiative energy losses $(-d\mathscr{E}/dx)$ as an integral of Eq. (86) over all radiated frequencies $\omega$. A larger problem for this procedure is the upper integration limit, $\omega \to \infty$, at which the integral diverges. This means that our approximate description, which considers the collision as an elastic process, becomes wrong, and needs to be amended by taking into account the difference between the initial and final kinetic energies of the particle due to radiation of the energy quantum $\hbar\omega$ of the emitted photon:

$$\frac{p_{\rm ini}^2}{2m} - \frac{p_{\rm fin}^2}{2m} = \hbar\omega. \tag{10.87}$$
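As a result, taking into account that the minimum and maximum values of $\mathscr{q}$ correspond to, respectively, the parallel and antiparallel alignments of vectors $\mathbf{p}_{\rm ini}$ and $\mathbf{p}_{\rm fin}$, we get

$$\frac{\mathscr{q}_{\max}}{\mathscr{q}_{\min}} = \frac{p_{\rm ini}+p_{\rm fin}}{p_{\rm ini}-p_{\rm fin}} = \frac{(p_{\rm ini}+p_{\rm fin})^2}{2m\hbar\omega} = \frac{\left[\mathscr{E}^{1/2}+(\mathscr{E}-\hbar\omega)^{1/2}\right]^2}{\hbar\omega}. \tag{10.88}$$

Plugged into Eq. (85), this expression yields the so-called Bethe-Heitler formula for quantum bremsstrahlung. 35 Note that at this approach, $\mathscr{q}_{\max}$ is close to that of the classical approximation, but $\mathscr{q}_{\min} \sim \hbar\omega/u$, so that

$$\frac{\mathscr{q}_{\min}\big|_{\rm classical}}{\mathscr{q}_{\min}\big|_{\rm quantum}} \sim \frac{(qq'/4\pi\varepsilon_0)\,\omega/u^2}{\hbar\omega/u} = \frac{qq'}{4\pi\varepsilon_0\hbar u} = \frac{\mathscr{z}\mathscr{z}'\alpha}{\beta}, \tag{10.89}$$

where $\mathscr{z}$ and $\mathscr{z}'$ are particles’ charges in units of $e$, and $\alpha$ is the fine structure (“Sommerfeld”) constant,

$$\alpha \equiv \left.\frac{e^2}{4\pi\varepsilon_0\hbar c}\right|_{\rm SI} = \left.\frac{e^2}{\hbar c}\right|_{\rm Gaussian} \approx \frac{1}{137} \ll 1, \tag{10.90}$$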


35 The modifications of this formula necessary for the relativistic case description are surprisingly minor - see, 
e.g., Chapter 15 of J. D. Jackson, Classical Electrodynamics, 3 rd ed., Wiley 1999. For more detail, the standard 
reference monograph on bremsstrahlung is W. Heitler, The Quantum Theory of Radiation, 3 rd ed., Oxford U. Press 
1954 (reprinted in 2010 by Dover). 
which is one of the basic notions of quantum mechanics. 36 For most cases of practical interest, ratio (89) is smaller than 1, and since we have to keep the highest value of $\mathscr{q}_{\min}$, the Bethe-Heitler formula should be used.
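To see how this criterion works in numbers, here is a minimal sketch. It assumes that the ratio in question is the classical estimate (qq′/4πε₀)ω/u² of the minimum momentum transfer divided by the quantum one, ħω/u, which equals 𝓏𝓏′α/β; the function names, the nonrelativistic expression for β, and the sample charges are illustrative assumptions.

```python
import math

ALPHA = 7.297_352_5693e-3          # fine structure constant, ~1/137
ELECTRON_REST_EV = 510_998.95      # electron rest energy, eV

def beta_nonrelativistic(kinetic_ev, rest_energy_ev=ELECTRON_REST_EV):
    """u/c from the kinetic energy, valid only for kinetic << rest energy."""
    return math.sqrt(2.0*kinetic_ev/rest_energy_ev)

def qmin_ratio(z, z_prime, beta):
    """Classical-to-quantum ratio of the minimum momentum transfer."""
    return abs(z*z_prime)*ALPHA/beta

def applicable_formula(z, z_prime, beta):
    """The larger q_min wins: quantum cutoff if the ratio is below 1."""
    if qmin_ratio(z, z_prime, beta) < 1.0:
        return "Bethe-Heitler"
    return "Bohr (classical)"

beta_6kev = beta_nonrelativistic(6.0e3)          # 6 keV electron
ratio_light = qmin_ratio(1, 7, beta_6kev)        # scattering on a nitrogen nucleus
ratio_heavy = qmin_ratio(1, 79, beta_6kev)       # scattering on a gold nucleus

print(f"beta ~ {beta_6kev:.3f}")
print(f"z' = 7 : ratio ~ {ratio_light:.2f} -> {applicable_formula(1, 7, beta_6kev)}")
print(f"z' = 79: ratio ~ {ratio_heavy:.2f} -> {applicable_formula(1, 79, beta_6kev)}")
```

For light scatterers the ratio is indeed below 1 even for rather slow electrons, while for slow particles scattered by heavy nuclei it may exceed 1, so that the classical cutoff takes over.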


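Now nothing prevents us from calculating the total radiative losses of energy per unit length:

$$-\frac{d\mathscr{E}}{dx} = \int_0^{\omega_{\max}}\left(-\frac{d^2\mathscr{E}}{d\omega\,dx}\right)d\omega = \frac{16}{3}\,n\,\frac{q^2}{4\pi\varepsilon_0}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{m^2c^3u^2}\;2\int_0^{\omega_{\max}}\ln\frac{\mathscr{E}^{1/2}+(\mathscr{E}-\hbar\omega)^{1/2}}{(\hbar\omega)^{1/2}}\,d\omega, \tag{10.91}$$

where $\hbar\omega_{\max} = \mathscr{E}$ is the maximum energy of the radiation quantum. By introducing the dimensionless integration variable $\xi \equiv \hbar\omega/\mathscr{E} = \hbar\omega/(mu^2/2)$, this integral is reduced to the table one, 37 and we get

$$-\frac{d\mathscr{E}}{dx} = \frac{16}{3}\,n\,\frac{q^2}{4\pi\varepsilon_0}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{m^2c^3u^2}\,\frac{2\mathscr{E}}{\hbar}\int_0^1\ln\frac{1+(1-\xi)^{1/2}}{\xi^{1/2}}\,d\xi = \frac{16}{3}\,n\,\frac{q^2}{4\pi\varepsilon_0\hbar c}\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{1}{mc^2}. \tag{10.92}$$

In my usual style, I would give you an estimate of the losses for a typical case; however, let me first compare them to a parallel energy loss mechanism, the so-called Coulomb losses, due to the transfer of mechanical impulse from the scattered particle to the scattering center. (This energy eventually goes into an increase of the thermal energy of the scattering medium.) Using Eqs. (9.139) for the electric field of a linearly moving charge, we can readily find the momentum it transfers to charge $q'$: 38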
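$$\Delta p' = q'\int_{-\infty}^{+\infty}E_y\,dt = \frac{qq'}{4\pi\varepsilon_0}\int_{-\infty}^{+\infty}\frac{\gamma b\,dt}{(b^2+\gamma^2u^2t^2)^{3/2}} = \frac{qq'}{4\pi\varepsilon_0}\,\frac{2}{bu}. \tag{10.93}$$

Hence, the kinetic energy acquired by the scattering center (equal to the loss of energy of the incident particle) is

$$-\Delta\mathscr{E} = \frac{(\Delta p')^2}{2m'} = \left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{2}{m'u^2b^2}. \tag{10.94}$$

Such energy losses have to be summed up over all collisions, with random values of the impact parameter $b$. At the scattering center density $n$, the number of collisions per small path length $dx$ per small range $db$ is $dN = n\,2\pi b\,db\,dx$, so that

$$-\frac{d\mathscr{E}}{dx} = n\int_{b_{\min}}^{b_{\max}}(-\Delta\mathscr{E})\,2\pi b\,db = 4\pi n\left(\frac{qq'}{4\pi\varepsilon_0}\right)^2\frac{\ln B}{m'u^2}, \quad \text{where } B \equiv \frac{b_{\max}}{b_{\min}}. \tag{10.95}$$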


Here the logarithmic integral over $b$ was treated similarly to that over $\mathscr{q}$ in the bremsstrahlung theory. This approach is adequate, because the ratio $b_{\max}/b_{\min}$ is much larger than 1. Indeed, $b_{\min}$ may be estimated from $(\Delta p')_{\max} \sim p = \gamma mu$. For this value, Eq. (93) with $q' \sim q$ gives $b_{\min} \sim r_c$ (see Eq. (8.41) and its discussion), which is, for elementary particles, of the order of $10^{-15}$ m. On the other hand, for the most important case when charges $q'$ belong to electrons (which, according to Eq. (94) are the most efficient

36 See, e.g., QM Secs. 6.3, 9.3, 9.5, and 9.7. 

37 See, e.g., MA Eq. (6.14). 

38 According to Eq. (9.139), $E_z = 0$, and the net impulse of the longitudinal force $q'E_x$ is zero.

Coulomb energy absorbers, due to their extremely low mass $m'$), $b_{\max}$ may be estimated from condition $\tau = b/\gamma u \sim 1/\omega_{\max}$, where $\omega_{\max} \sim 10^{16}\ \text{s}^{-1}$ is the characteristic frequency of electron transitions in atoms. (Below this frequency, our classical analysis of scatterer’s motion is invalid.) From here, we have the estimate $b_{\max} \sim \gamma u/\omega_{\max}$, so that

$$B \equiv \frac{b_{\max}}{b_{\min}} \sim \frac{\gamma u}{r_c\,\omega_{\max}}, \tag{10.96}$$

for $\gamma \sim 1$ and $u \sim c \approx 3\times10^8$ m/s giving $b_{\max} \sim 3\times10^{-8}$ m, and $B \sim 10^9$ (give or take a couple orders of magnitude - this does not change the estimate $\ln B \approx 20$ too much). 39


Now we can compare the Coulomb losses (95) with those due to the bremsstrahlung, given by Eq. (92):

$$\frac{(-d\mathscr{E}/dx)_{\rm bremsstrahlung}}{(-d\mathscr{E}/dx)_{\rm Coulomb}} = \frac{4}{3\pi}\,\alpha\mathscr{z}^2\,\frac{m'}{m}\,\frac{\beta^2}{\ln B}. \tag{10.97}$$

Since $\alpha \sim 10^{-2} \ll 1$, for nonrelativistic particles ($\beta \ll 1$) the Coulomb losses of energy are much higher, and only for ultrarelativistic particles, the relation may be opposite.

According to Eq. (95), for electron-electron scattering ($q = q' = -e$, $m' = m_e$), 40 at the value $n \approx 6\times10^{26}\ \text{m}^{-3}$ typical for air at ambient conditions, the characteristic length of energy loss,

$$l_c \equiv \frac{\mathscr{E}}{(-d\mathscr{E}/dx)}, \tag{10.98}$$

for electrons with kinetic energy $\mathscr{E} = 6$ keV is close to $2\times10^{-4}$ m = 0.2 mm. (This is why you need vacuum in CRT monitors and electron microscope columns!) Since $l_c \propto \mathscr{E}^2$, more energetic particles penetrate deeper, until the bremsstrahlung steps in at very high energies.
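This estimate may be reproduced with a short calculation. The sketch below is an illustration under explicit assumptions: it takes the nonrelativistic Coulomb loss rate in the form −dℰ/dx = 4πn(qq′/4πε₀)² ln B/(m′u²), treats ln B = 20 as a fixed number, uses the electron density n = 6×10²⁶ m⁻³ quoted above, and defines the loss length as l_c = ℰ/(−dℰ/dx); the function names are hypothetical.

```python
import math

E_CHARGE = 1.602_176_634e-19     # elementary charge, C
EPS0 = 8.854_187_8128e-12        # vacuum permittivity, F/m
M_E = 9.109_383_7015e-31         # electron mass, kg

def coulomb_loss_rate(kinetic_j, n, log_b, m=M_E, m_prime=M_E,
                      q=E_CHARGE, q_prime=E_CHARGE):
    """Assumed form: 4*pi*n*(q q'/4 pi eps0)^2 * ln(B) / (m' u^2), in J/m."""
    coupling = q*q_prime/(4.0*math.pi*EPS0)
    u_squared = 2.0*kinetic_j/m           # nonrelativistic kinetic energy
    return 4.0*math.pi*n*coupling**2*log_b/(m_prime*u_squared)

def loss_length(kinetic_ev, n=6e26, log_b=20.0):
    """Characteristic length l_c = E / (-dE/dx), in meters."""
    kinetic_j = kinetic_ev*E_CHARGE
    return kinetic_j/coulomb_loss_rate(kinetic_j, n, log_b)

l_6kev = loss_length(6.0e3)
l_12kev = loss_length(12.0e3)

print(f"l_c(6 keV)  ~ {l_6kev*1e3:.2f} mm")
print(f"l_c(12 keV) ~ {l_12kev*1e3:.2f} mm")
```

With ln B held constant, doubling the energy quadruples the loss length, which is just the quadratic scaling noted above; in reality ln B changes slowly with energy, so the scaling is only approximate.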


10.5. Density effects and the Cherenkov radiation 

For condensed matter, the Coulomb loss estimate made in the last section is not quite suitable, because it is based on the upper cutoff $b_{\max} \sim \gamma u/\omega_{\max}$. For the example given above, incoming electron velocity $u$ is close to $5\times10^7$ m/s, and for the typical value $\omega_{\max} \sim 10^{16}\ \text{s}^{-1}$ ($\hbar\omega_{\max} \sim 10$ eV), this cutoff $b_{\max} \sim 5\times10^{-9}$ m = 5 nm. Even for air at ambient conditions, this is larger than the average distance (~2 nm) between the molecules, so that at the high end of the impact parameter range, at $b \sim b_{\max}$, the Coulomb loss events in adjacent molecules are not quite independent, and the theory needs corrections. For condensed matter, with much higher particle density $n$, most collisions satisfy condition


39 A quantum analysis (carried out by H. Bethe in 1940) replaces, in Eq. (95), $\ln B$ with $\ln(2\gamma^2mu^2/\hbar\langle\omega\rangle) - \beta^2$, where $\langle\omega\rangle$ is the average frequency of the atomic quantum transitions weighted by their oscillator strength. This refinement does not change the estimate given below. Note that both the classical and quantum formulas describe a fast increase (as $1/\beta$) of the energy loss rate $(-d\mathscr{E}/dx)$ at $\gamma \to 1$ and its slow increase (as $\ln\gamma$) at $\gamma \to \infty$, so that the losses have a minimum at $(\gamma - 1) \sim 1$.

40 Actually, the above analysis has neglected the change of momentum of the incident particle. This is legitimate at $m' \ll m$, but for $m = m'$ the change approximately doubles the energy losses. Still, this does not change the order of magnitude of the estimate.



$$nb^3 \gg 1, \tag{10.99}$$

and the treatment of Coulomb collisions as independent events is completely inadequate. However, condition (99) enables the opposite approach: treating the medium as a continuum. In the time domain formulation, used in the previous sections of this chapter, this would be a very complex problem, because it would require an explicit description of medium dynamics. Here the frequency-domain approach, based on the Fourier transform in both time and space, helps a lot, provided that functions $\varepsilon(\omega)$ and $\mu(\omega)$ are considered known - either calculated or taken from experiment. Let us have a good look at such approach, because it gives some interesting (and practically important) results.

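In Chapter 6, we have used the macroscopic Maxwell equations to derive Eqs. (6.109), which describe the time evolution of potentials in a medium with frequency-independent $\varepsilon$ and $\mu$. Looking for all functions participating in Eqs. (6.109) in the form of plane-wave expansion 41

$$f(\mathbf{r},t) = \int d^3k\int d\omega\,f_{\mathbf{k},\omega}\,e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}, \tag{10.100}$$

and requiring all coefficients at similar exponents to be balanced, we get their Fourier image: 42

$$\left[k^2-\omega^2\varepsilon\mu\right]\phi_{\mathbf{k},\omega} = \frac{\rho_{\mathbf{k},\omega}}{\varepsilon}, \quad \left[k^2-\omega^2\varepsilon\mu\right]\mathbf{A}_{\mathbf{k},\omega} = \mu\,\mathbf{j}_{\mathbf{k},\omega}. \tag{10.101}$$

As was discussed in Chapter 7, in such a Fourier form, the Maxwell theory remains valid even for the dispersive media, so that Eq. (101) is generalized as

$$\left[k^2-\omega^2\varepsilon(\omega)\mu(\omega)\right]\phi_{\mathbf{k},\omega} = \frac{\rho_{\mathbf{k},\omega}}{\varepsilon(\omega)}, \quad \left[k^2-\omega^2\varepsilon(\omega)\mu(\omega)\right]\mathbf{A}_{\mathbf{k},\omega} = \mu(\omega)\,\mathbf{j}_{\mathbf{k},\omega}. \tag{10.102}$$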


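The evident advantage of these equations is that their formal solution is trivial:

$$\phi_{\mathbf{k},\omega} = \frac{1}{\varepsilon(\omega)}\,\frac{\rho_{\mathbf{k},\omega}}{k^2-\omega^2\varepsilon(\omega)\mu(\omega)}, \quad \mathbf{A}_{\mathbf{k},\omega} = \mu(\omega)\,\frac{\mathbf{j}_{\mathbf{k},\omega}}{k^2-\omega^2\varepsilon(\omega)\mu(\omega)}, \tag{10.103}$$

Field potentials in a linear medium

so that the “only” remaining thing to do is to calculate the Fourier transforms of functions $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$, describing stand-alone charges and currents, using the transform reciprocal to Eq. (100), with one factor $1/2\pi$ per each scalar dimension,

$$f_{\mathbf{k},\omega} = \frac{1}{(2\pi)^4}\int d^3r\int dt\,f(\mathbf{r},t)\,e^{-i(\mathbf{k}\cdot\mathbf{r}-\omega t)}, \tag{10.104}$$

and then carry out the integration (100).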

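For our current problem of a single charge $q$, uniformly moving in the medium with velocity $\mathbf{u}$,

$$\rho(\mathbf{r},t) = q\,\delta(\mathbf{r}-\mathbf{u}t), \quad \mathbf{j}(\mathbf{r},t) = q\mathbf{u}\,\delta(\mathbf{r}-\mathbf{u}t), \tag{10.105}$$

the first task is easy: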


41 All integrals here and below are in infinite limits, unless specified otherwise. 

42 As was discussed in Sec. 7.2, the Ohmic conductivity of the medium (generally, also a function of frequency) 
may be readily incorporated into the dielectric permittivity: $\varepsilon(\omega) \to \varepsilon(\omega) + i\sigma(\omega)/\omega$. In this section, I will assume that such incorporation, which is especially natural for high frequencies, has been performed, so that the current density $\mathbf{j}(\mathbf{r}, t)$ describes only stand-alone currents - for example, the current (105) of the incident particle.

$$\rho_{\mathbf{k},\omega} = \frac{1}{(2\pi)^4}\int d^3r\int dt\,q\,\delta(\mathbf{r}-\mathbf{u}t)\,e^{-i(\mathbf{k}\cdot\mathbf{r}-\omega t)} = \frac{q}{(2\pi)^4}\int e^{i(\omega t-\mathbf{k}\cdot\mathbf{u}t)}\,dt = \frac{q}{(2\pi)^3}\,\delta(\omega-\mathbf{k}\cdot\mathbf{u}). \tag{10.106}$$


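Since expressions (105) for $\rho(\mathbf{r}, t)$ and $\mathbf{j}(\mathbf{r}, t)$ differ only by a constant factor $\mathbf{u}$, it is clear that an
absolutely similar calculation for the current gives

$$\mathbf{j}_{\mathbf{k},\omega} = \frac{q\mathbf{u}}{(2\pi)^3}\,\delta(\omega-\mathbf{k}\cdot\mathbf{u}). \tag{10.107}$$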


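Let us summarize what we have got by now, plugging Eqs. (106) and (107) into Eqs. (103):

$$\phi_{\mathbf{k},\omega} = \frac{1}{(2\pi)^3\varepsilon(\omega)}\,\frac{q\,\delta(\omega-\mathbf{k}\cdot\mathbf{u})}{k^2-\omega^2\varepsilon(\omega)\mu(\omega)}, \qquad \mathbf{A}_{\mathbf{k},\omega} = \frac{1}{(2\pi)^3}\,\frac{\mu(\omega)\,q\mathbf{u}\,\delta(\omega-\mathbf{k}\cdot\mathbf{u})}{k^2-\omega^2\varepsilon(\omega)\mu(\omega)} = \varepsilon(\omega)\mu(\omega)\,\mathbf{u}\,\phi_{\mathbf{k},\omega}. \tag{10.108}$$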


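Now, at the last step of the calculation, namely the integration (100), we start to pay a heavy
price for the ease of the first steps. So let us think carefully about what exactly we need from it.
First of all, for the calculation of power losses, the electric field is more convenient to use than the
potentials, so let us calculate the Fourier images of $\mathbf{E}$ and $\mathbf{B}$. Plugging expansion (100) into the
fundamental relations (6.106), and again requiring the balance of exponent’s coefficients, we get

$$\mathbf{E}_{\mathbf{k},\omega} = i\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u}-\mathbf{k}\right]\phi_{\mathbf{k},\omega}, \qquad \mathbf{B}_{\mathbf{k},\omega} = i\mathbf{k}\times\mathbf{A}_{\mathbf{k},\omega} = i\varepsilon(\omega)\mu(\omega)\,\mathbf{k}\times\mathbf{u}\,\phi_{\mathbf{k},\omega}, \tag{10.109}$$

so that Eqs. (100) and (108) yield

$$\mathbf{E}(\mathbf{r},t) = \frac{iq}{(2\pi)^3}\int d^3k\int d\omega\, \frac{\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u}-\mathbf{k}\right]\delta(\omega-\mathbf{k}\cdot\mathbf{u})}{\varepsilon(\omega)\left[k^2-\omega^2\varepsilon(\omega)\mu(\omega)\right]}\, e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}. \tag{10.110}$$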


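With the notation used in Eq. (51), this integral may be partitioned as

$$\mathbf{E}(\mathbf{r},t) = \int \mathbf{E}_\omega\, e^{-i\omega t}\,d\omega, \qquad \mathbf{E}_\omega \equiv \int \mathbf{E}_{\mathbf{k},\omega}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k = \frac{iq}{(2\pi)^3}\int \frac{\left[\omega\varepsilon(\omega)\mu(\omega)\mathbf{u}-\mathbf{k}\right]\delta(\omega-\mathbf{k}\cdot\mathbf{u})}{\varepsilon(\omega)\left[k^2-\omega^2\varepsilon(\omega)\mu(\omega)\right]}\, e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k. \tag{10.111}$$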


Let us calculate the Cartesian components of the partial Fourier image $\mathbf{E}_\omega$ at a point separated
by distance $b$ from the particle’s trajectory. Selecting the coordinates and time origin as shown in Fig. 9.11a,
we have $\mathbf{r} = \{0, b, 0\}$, so that only $E_x$ and $E_y$ are not vanishing. In particular, according to Eq. (111),

$$(E_x)_\omega = \frac{iq}{(2\pi)^3\varepsilon(\omega)}\int dk_x\int dk_y\int dk_z\, \frac{\omega\varepsilon(\omega)\mu(\omega)u-k_x}{k^2-\omega^2\varepsilon(\omega)\mu(\omega)}\,\delta(\omega-k_xu)\,e^{ik_yb}. \tag{10.112}$$

The delta-function kills one integral (over $k_x$) of three, and we get:

$$(E_x)_\omega = \frac{iq}{(2\pi)^3\varepsilon(\omega)u}\left[\omega\varepsilon(\omega)\mu(\omega)u-\frac{\omega}{u}\right]\int dk_y\, e^{ik_yb}\int \frac{dk_z}{\omega^2/u^2+k_y^2+k_z^2-\omega^2\varepsilon(\omega)\mu(\omega)}. \tag{10.113}$$


The last integral (over $k_z$) may be readily reduced to the table integral $\int d\xi/(1+\xi^2)$, in infinite limits,
equal to $\pi$. 43 The result may be presented as

$$(E_x)_\omega = -\frac{i\pi q\kappa^2}{(2\pi)^3\omega\varepsilon(\omega)}\int \frac{e^{ik_yb}}{\left(k_y^2+\kappa^2\right)^{1/2}}\,dk_y, \tag{10.114}$$

where parameter $\kappa$ (generally, a complex function of frequency) is defined as

$$\kappa^2 \equiv \omega^2\left(\frac{1}{u^2}-\varepsilon(\omega)\mu(\omega)\right). \tag{10.115}$$

43 See, e.g., MA Eq. (6.5a).


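The last integral may be expressed via the modified Bessel function of the second kind: 44

$$(E_x)_\omega = -\frac{iq\kappa^2}{(2\pi)^2\omega\varepsilon(\omega)}\,K_0(\kappa b). \tag{10.116}$$

A similar calculation yields

$$(E_y)_\omega = \frac{q\kappa}{(2\pi)^2\varepsilon(\omega)u}\,K_1(\kappa b). \tag{10.117}$$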


Now, instead of rushing to make the final integration (111) over frequency to calculate $\mathbf{E}(t)$, let
us realize that what we need for power losses is only the total energy loss through the whole time of
particle passage. Energy loss per unit volume is

$$-\frac{d\mathscr{E}}{dV} = \int \mathbf{j}\cdot\mathbf{E}\,dt, \tag{10.118}$$

where $\mathbf{j}$ is the current of bound charges in the medium, and should not be confused with the free
particle’s current (105). This integral may be readily expressed via the partial Fourier image $\mathbf{E}_\omega$ and the
similarly defined image $\mathbf{j}_\omega$, just as it was done at the derivation of Eq. (54):

$$-\frac{d\mathscr{E}}{dV} = \int dt\int d\omega\, e^{-i\omega t}\int d\omega'\, e^{-i\omega' t}\;\mathbf{j}_\omega\cdot\mathbf{E}_{\omega'} = 2\pi\int d\omega\int d\omega'\;\mathbf{j}_\omega\cdot\mathbf{E}_{\omega'}\,\delta(\omega+\omega') = 2\pi\int \mathbf{j}_\omega\cdot\mathbf{E}_{-\omega}\,d\omega. \tag{10.119}$$


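In our approach, the Ohmic conductance is incorporated into the complex permittivity $\varepsilon(\omega)$, so that,
according to the discussion in the end of Sec. 7.2, the current’s Fourier image is

$$\mathbf{j}_\omega = \sigma(\omega)\mathbf{E}_\omega = -i\omega\varepsilon(\omega)\mathbf{E}_\omega. \tag{10.120}$$

As a result, Eq. (119) yields

$$-\frac{d\mathscr{E}}{dV} = -2\pi i\int \varepsilon(\omega)\,\mathbf{E}_\omega\cdot\mathbf{E}_{-\omega}\,\omega\,d\omega = 4\pi\,\mathrm{Im}\int_0^\infty \varepsilon(\omega)\left|\mathbf{E}_\omega\right|^2\omega\,d\omega. \tag{10.121}$$

(The last transition is possible due to the property $\varepsilon(-\omega) = \varepsilon^*(\omega)$, which was discussed in Sec. 7.2.)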
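The folding of the two-sided frequency integral into the one-sided form of Eq. (121) may be checked numerically. The sketch below is only an illustration: it assumes a single-resonance permittivity of the Lorentz type with made-up parameters (`omega_0`, `delta`, `strength`), and a real field, for which $|\mathbf{E}_{-\omega}|^2 = |\mathbf{E}_\omega|^2$; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def lorentz_eps(omega, omega_0=3.0e15, delta=1.0e13, strength=2.0e31):
    """Single-resonance permittivity eps0*[1 + strength/((w0^2 - w^2) - 2i*w*delta)].

    omega_0, delta and strength are illustrative values, not data for a real medium.
    """
    return EPS0 * (1.0 + strength / ((omega_0**2 - omega**2) - 2j * omega * delta))


def two_sided_term(omega, e_sq=1.0):
    """-2*pi*i * eps(w) * |E_w|^2 * w, summed over the frequencies +omega and -omega."""
    def term(w):
        return -2j * math.pi * lorentz_eps(w) * e_sq * w
    return term(omega) + term(-omega)


def one_sided_term(omega, e_sq=1.0):
    """4*pi * Im[eps(w)] * |E_w|^2 * w: the integrand of the folded, one-sided form."""
    return 4.0 * math.pi * lorentz_eps(omega).imag * e_sq * omega


w = 2.5e15  # a sample frequency below the assumed resonance, rad/s
print(lorentz_eps(-w), lorentz_eps(w).conjugate())  # the two values coincide
print(two_sided_term(w), one_sided_term(w))         # equal up to rounding
```

The pair of contributions from $\pm\omega$ is real and positive because only the imaginary (lossy) part of the permittivity survives the folding.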

Finally, just as in the last section, we have to calculate the energy loss rate averaged over random
values of the impact parameter $b$:

$$-\frac{d\mathscr{E}}{dx} = \int\left(-\frac{d\mathscr{E}}{dV}\right)d^2b = 2\pi\int_{b_{\min}}^\infty\left(-\frac{d\mathscr{E}}{dV}\right)b\,db = 8\pi^2\int_{b_{\min}}^\infty b\,db\int_0^\infty\left(\left|E_x\right|_\omega^2+\left|E_y\right|_\omega^2\right)\mathrm{Im}\,\varepsilon(\omega)\,\omega\,d\omega. \tag{10.122}$$

44 As a reminder, the main properties of these functions are listed in Sec. 2.5 of these notes - see, in particular,
Fig. 2.20b and Eqs. (2.157)-(2.158).


Note that we are cutting the resulting integral over $b$ from below at some $b_{\min}$ where our theory loses
legitimacy. (On that limit, we are not doing much better than in the past section.) Plugging in the
calculated expressions (116) and (117) for the field components, swapping the integrals, and using
recurrence relations (2.142), which are valid for any Bessel functions, we finally get:

$$-\frac{d\mathscr{E}}{dx} = \frac{2}{\pi}\,q^2\,\mathrm{Im}\int_0^\infty \left(\kappa^*b_{\min}\right)K_1\!\left(\kappa^*b_{\min}\right)K_0\!\left(\kappa b_{\min}\right)\frac{d\omega}{\omega\varepsilon(\omega)}. \tag{10.123}$$


This general result is valid for an arbitrary linear medium, with arbitrary dispersion relations
$\varepsilon(\omega)$ and $\mu(\omega)$. (The last function participates in Eq. (123) only via Eq. (115) which defines parameter
$\kappa$.) To get more concrete results, some particular model of the medium should be used. Let us explore
the Lorentz oscillator model, which was discussed in Sec. 7.2, in its form (7.33) suitable for the transition to
the quantum-mechanical description of atoms:


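$$\varepsilon(\omega) = \varepsilon_0 + \frac{nq'^2}{m}\sum_j \frac{f_j}{\left(\omega_j^2-\omega^2\right)-2i\omega\delta_j}, \qquad \sum_j f_j = 1, \qquad \mu(\omega) = \mu_0. \tag{10.124}$$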


If the damping of the effective atomic oscillators is low, $\delta_j \ll \omega_j$, and the particle’s speed $u$ is much lower
than the typical wave’s phase velocity $v$ (and hence than $c$!), then for most frequencies Eq. (115) gives

$$\kappa^2 \equiv \omega^2\left(\frac{1}{u^2}-\frac{1}{v^2(\omega)}\right) \approx \frac{\omega^2}{u^2}, \tag{10.125}$$


i.e. $\kappa = \kappa^* \approx \omega/u$ is real. In this case, Eq. (123) may be shown to give Eq. (95) with

$$b_{\max} = \frac{1.123\,u}{\langle\omega\rangle}. \tag{10.126}$$


The good news here is that both approaches (the microscopic analysis of Sec. 4 and the macroscopic
analysis of this section) give essentially the same result. This fact may also be perceived as bad news:
the treatment of the medium as a continuum does not give any new results here. The situation changes
somewhat at relativistic velocities, at which such treatment provides noticeable corrections (called density
effects), in particular reducing the energy loss estimates.

Let me, however, skip these details and focus on a much more important effect described by our
formulas. Consider the dependence of the electric field components on the impact parameter $b$, i.e. on
the closest distance between the particle’s trajectory and the field observation point. If $\kappa^2 > 0$, then $\kappa$ is real,
and we can use, in Eqs. (116)-(117), the asymptotic formula (2.158),

$$K_n(\xi) \to \left(\frac{\pi}{2\xi}\right)^{1/2}e^{-\xi}, \quad \text{at } \xi \to \infty, \tag{10.127}$$

to conclude that the complex amplitudes $E_\omega$ of both components $E_x$ and $E_y$ of the electric field decrease
exponentially, starting from $b \sim u/\omega$. However, let us consider what happens at frequencies where $\kappa^2 < 0$,
i.e.

$$\varepsilon(\omega)\mu(\omega) \equiv \frac{1}{v^2(\omega)} > \frac{1}{u^2} > \frac{1}{c^2} \equiv \varepsilon_0\mu_0. \tag{10.128}$$

(This condition means that the particle’s velocity is larger than the phase velocity of waves, at this particular
frequency.) In these intervals, $\kappa$ is purely imaginary, 45 functions $\exp\{-\kappa b\}$ become just phase factors, and

$$\left|E_x(\omega)\right| \propto \left|E_y(\omega)\right| \propto \frac{1}{b^{1/2}}. \tag{10.129}$$

This means that the Poynting vector drops as $1/b$, so that its flux through the surface of a round cylinder of
radius $b$, with the axis on the particle trajectory (i.e. the power flow), does not depend on $b$. Hence, this is
wave emission - the famous Cherenkov radiation. 46
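The accuracy of the asymptotic formula (127) for real arguments is easy to probe numerically. The following minimal sketch uses only the Python standard library; it assumes the standard integral representation $K_n(x) = \int_0^\infty e^{-x\cosh t}\cosh(nt)\,dt$ (valid for $x > 0$) and a plain trapezoid rule, and the helper names `bessel_k` and `asymptotic_k` are hypothetical.

```python
import math


def bessel_k(order, x, t_max=12.0, steps=20000):
    """Modified Bessel function K_order(x) for real x >= 1, computed from
    K_n(x) = int_0^inf exp(-x*cosh(t)) * cosh(n*t) dt with the trapezoid rule."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # t = 0 endpoint, where cosh(0) = 1
    for i in range(1, steps + 1):
        t = i * h
        # For large t the exponential underflows to 0.0, so the tail is negligible.
        total += math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h


def asymptotic_k(x):
    """Leading large-argument term, the same for all orders n: sqrt(pi/(2x))*exp(-x)."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)


for x in (1.0, 5.0, 20.0):
    print(x, bessel_k(0, x) / asymptotic_k(x), bessel_k(1, x) / asymptotic_k(x))
```

Both ratios approach 1 as the argument grows, with $K_0$ approaching from below and $K_1$ from above. For purely imaginary $\kappa$ the same leading term has the modulus $(\pi/2|\kappa|b)^{1/2}$, which is the origin of the $1/b^{1/2}$ scaling in Eq. (129).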

The direction of its propagation may be readily found taking into account that at large distances
from the particle’s trajectory the emitted wave has to be locally planar, so that the Cherenkov angle $\theta$ may
be found from the ratio of the field components (Fig. 13a):

$$\tan\theta = -\frac{E_x}{E_y}. \tag{10.130}$$

Fig. 10.13. (a) Cherenkov radiation’s propagation angle $\theta$, and (b) its interpretation.


This ratio may be calculated by plugging the asymptotic formula (127) into Eqs. (116) and (117)
and calculating their ratio:

$$\tan\theta = -\frac{E_x}{E_y} = \frac{i\kappa u}{\omega} = \left[\varepsilon(\omega)\mu(\omega)u^2-1\right]^{1/2} = \left[\frac{u^2}{v^2(\omega)}-1\right]^{1/2}, \tag{10.131a}$$

so that

$$\cos\theta = \frac{v(\omega)}{u} < 1. \tag{10.131b}$$

45 Strictly speaking, inequality $\kappa^2 < 0$ does not make sense for a medium with complex $\varepsilon(\omega)\mu(\omega)$ and hence
complex $\kappa^2(\omega)$. However, in a typical medium where particles can propagate over substantial distances, the
imaginary part of the product $\varepsilon(\omega)\mu(\omega)$ does not vanish only in very limited frequency intervals, much narrower
than the intervals which we are now discussing - please have one more look at Fig. 7.5.

46 This radiation was observed experimentally by P. Cherenkov (in older Western texts, “Cerenkov”) in 1934,
with the observations explained by I. Frank and E. Tamm in 1937. Note, however, that the effect had been
predicted theoretically as early as in 1889 by the same O. Heaviside whose name was mentioned so many times
above - and whose genius I believe is still underappreciated.


Remarkably, this direction does not depend on the emission time $t'$, so that radiation of
frequency $\omega$, at each instant, forms a hollow cone led by the particle. This simple result allows an
evident interpretation (Fig. 13b): the cone is just the set of all observation points that may be reached by
“signals” propagating with speed $v(\omega) < u$ from all previous points of the particle’s trajectory.
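As a quick numerical illustration of the cone geometry, here is a minimal sketch that evaluates the Cherenkov angle from $\cos\theta = v/u$. It assumes the medium is characterized by a refractive index $n = c/v(\omega)$ and the particle by $\beta = u/c$, so that $\cos\theta = 1/n\beta$; the function name and the sample value $n = 1.33$ (roughly that of water for visible light) are illustrative choices, not part of the derivation above.

```python
import math


def cherenkov_angle(beta, n):
    """Return the Cherenkov angle theta (in radians) from cos(theta) = v/u = 1/(n*beta),
    or None below the emission threshold, i.e. when u <= v (n*beta <= 1)."""
    if n * beta <= 1.0:
        return None
    return math.acos(1.0 / (n * beta))


theta = cherenkov_angle(beta=1.0, n=1.33)      # ultrarelativistic limit, assumed n
print(math.degrees(theta))                     # about 41 degrees
print(math.tan(theta), math.sqrt(1.33**2 - 1)) # tan(theta) = [u^2/v^2 - 1]^(1/2)
print(cherenkov_angle(beta=0.7, n=1.33))       # None: particle slower than the wave
```

The second printed line checks that the angle obtained from the cosine agrees with the tangent form of the same relation.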


This phenomenon is closely related to the so-called Mach cone in fluid dynamics, 47 except that
in the Cherenkov radiation there is a separate cone for each frequency (of the range in which $v(\omega) < u$):
the smaller the $\varepsilon(\omega)\mu(\omega)$ product, i.e. the larger the wave velocity $v(\omega) = 1/[\varepsilon(\omega)\mu(\omega)]^{1/2}$, the
broader the cone, i.e. the earlier the corresponding “shock wave” arrives at an observer. Please note
that the Cherenkov radiation is a unique radiative phenomenon: it takes place even if a particle moves
without acceleration, and (in agreement with our analysis in Sec. 2) is impossible in free space, where
$v = c$ is always larger than $u$.


The intensity of the Cherenkov radiation may also be readily found by plugging the
asymptotic expression (127), with imaginary $\kappa$, into Eq. (123). The result is

$$-\frac{d\mathscr{E}}{dx} = \frac{q^2}{4\pi}\int_{v(\omega)<u}\mu(\omega)\,\omega\left(1-\frac{v^2(\omega)}{u^2}\right)d\omega. \tag{10.132}$$


For nonrelativistic particles ($u \ll c$), the Cherenkov radiation condition $u > v(\omega)$ may be fulfilled only
in relatively narrow frequency intervals where the product $\varepsilon(\omega)\mu(\omega)$ is very large (usually, due to optical
resonance peaks of the electric permittivity - see Fig. 7.5 and its discussion). In this case the emitted
light consists of a few nearly monochromatic components. On the contrary, if the condition $u > v(\omega)$, i.e.
$u^2\varepsilon(\omega)\mu(\omega) > 1$, is fulfilled in a broad frequency range (as it is for ultrarelativistic particles in condensed
media), the radiated power is clearly dominated by the higher frequencies of the range - hence the famous
bluish color of the Cherenkov radiation glow in water nuclear reactors - see Fig. 14.
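The dominance of higher frequencies may be illustrated with a few lines of code. This sketch assumes the standard Frank-Tamm form of the spectral density of the loss, $(q^2/4\pi)\,\mu\,\omega\,[1 - v^2(\omega)/u^2]$ per unit path and unit angular frequency, a nonmagnetic medium ($\mu = \mu_0$), and a frequency-independent refractive index $n = c/v = 1.33$ over the visible range; all names and sample numbers are illustrative assumptions.

```python
import math

C = 299792458.0             # speed of light, m/s
MU0 = 1.25663706212e-6      # vacuum permeability, H/m
E_CHARGE = 1.602176634e-19  # elementary charge, C


def frank_tamm_density(omega, beta, n, q=E_CHARGE, mu=MU0):
    """Energy radiated per unit path length and unit angular frequency (J*s/m),
    assuming the Frank-Tamm form; zero below the threshold n*beta <= 1."""
    if n * beta <= 1.0:
        return 0.0
    return q**2 / (4.0 * math.pi) * mu * omega * (1.0 - 1.0 / (n * beta) ** 2)


def omega_from_wavelength(vacuum_wavelength):
    """Angular frequency corresponding to a vacuum wavelength given in meters."""
    return 2.0 * math.pi * C / vacuum_wavelength


blue = frank_tamm_density(omega_from_wavelength(400e-9), beta=0.999, n=1.33)
red = frank_tamm_density(omega_from_wavelength(700e-9), beta=0.999, n=1.33)
print(blue / red)  # ratio of spectral densities at the two ends of the visible range
```

With a constant index the ratio reduces to the frequency ratio 700/400, i.e. the blue end of the visible range carries almost twice the spectral density of the red end; real dispersion of $n(\omega)$ modifies this number but not the trend.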



Fig. 10.14. Cherenkov radiation glow coming from the
Advanced Test Reactor of the Idaho National Laboratory.
Adapted from http://en.wikipedia.org/wiki/Cherenkov_radiation .


47 See, e.g., a brief discussion in CM Sec. 8.6.

The Cherenkov radiation is broadly used in high-energy
experiments for particle identification and speed measurement (since it is easy to pass particles through
media of various density and hence of various dielectric constant) - for example, in the so-called Ring
Imaging Cherenkov (RICH) detectors that have been designed for the DELPHI experiment 48 at the
Large Electron-Positron Collider (LEP) at CERN.

A little bit counter-intuitively, the formalism described in this section is also very useful for the
description of an apparently rather different effect - the so-called transition radiation that takes place
when a charged particle crosses a border between two media. 49 The effect may be understood as a result
of the time dependence of the electric dipole formed by the moving charge and its mirror image in the
counterpart medium - see Fig. 15. In the nonrelativistic limit, the effect allows a straightforward
description combining the electrostatics picture of Sec. 3.4 (see Fig. 3.9 and its discussion), and Eq.
(8.27) - slightly corrected for polarization effects of the media. However, if the particle’s velocity $u$ is
comparable with the phase velocity of waves in either medium, the adequate theory of the transition
radiation becomes very close to that of the Cherenkov radiation.



Fig. 10.15. Physics of the transition 
radiation. 


In comparison with the Cherenkov radiation, the transition radiation is rather weak, and its 
practical use (mostly for the measurement of the relativistic factor y, to which the radiation intensity is 
proportional) requires multi-layered stacks. 50 In these systems, the radiation emitted at sequential 
borders may be coherent, and the system’s physics becomes close to that of the undulators discussed in 
Sec. 4. 


10.6. Radiation’s back-action 

An attentive reader could notice that so far our treatment of charged particle dynamics has never
been fully self-consistent. Indeed, in Sec. 9.6 we have analyzed the particle’s motion in various external
fields, ignoring the fields radiated by the particle itself, while in Sec. 8.2 and earlier in this chapter these
fields have been calculated (admittedly, just for a few simple cases), but, again, their back-action on the
emitting particle has been ignored. Only in a few cases have we taken the back effects of the radiation
into account implicitly, via the energy conservation. However, even in these cases, the near-field components
of the fields (such as the first term in Eq. (20a)), which affect the moving particle most, have been ignored.

48 See, e.g., http://delphiwww.cern.ch/offline/physics/delphi-detector.html . For a broader view at radiation
detectors (including Cherenkov ones), the reader may be referred to the classical text by G. F. Knoll, Radiation
Detection and Measurement, 4th ed., Wiley, 2010, and a newer treatment by K. Kleinknecht, Detectors for
Particle Radiation, Cambridge U. Press, 1999.

49 The effect was predicted theoretically in 1946 by V. Ginzburg and I. Frank, and only later observed
experimentally.

50 See, e.g., Sec. 5.3 in K. Kleinknecht’s monograph cited above.

At the same time, it is clear that generally the interaction of a point charge with its own field
cannot always be ignored. As the simplest example, if an electron is made to fly through a resonant
cavity, thus inducing oscillations in it, and then is forced to return to it before the oscillations have
decayed, its motion will certainly be affected by the oscillating fields, just as if they had been induced
by another source. There is no conceptual problem with applying the Maxwell theory to such
“field-particle rendezvous” effects; moreover, it is the basis of the engineering design of such electron devices
as klystrons, magnetrons, and undulators.

A problem arises only when no finite “rendezvous” point is enforced by boundary conditions, so
that the most important self-field effects are at $R \equiv |\mathbf{r} - \mathbf{r}'| \to 0$, the most evident example being the
radiation of a particle in free space, described earlier in this chapter. We already know that radiation takes
away a part of the charge’s kinetic energy, i.e. has to cause its deceleration. One should wonder, however,
whether such self-action effects might be described in a more direct, non-perturbative way.

As the first attempt, let us try a phenomenological approach based on the already derived
formulas for the radiation power $P$. For the sake of simplicity, let us consider a nonrelativistic point charge
$q$ in free space, so that $P$ is described by Eq. (8.27), with the time derivative of the electric dipole moment
equal to $q\mathbf{u}$:

$$P = \frac{Z_0q^2}{6\pi c^2}\,\dot{u}^2 = \frac{2}{3c^3}\,\frac{q^2}{4\pi\varepsilon_0}\,\dot{u}^2. \tag{10.133}$$

The most naive approach would be to write the equation of the particle’s motion in the form

$$m\dot{\mathbf{u}} = \mathbf{F}_{\rm ext} + \mathbf{F}_{\rm self}, \tag{10.134}$$

and try to calculate the radiation back-action force by requiring its instant power, $-\mathbf{F}_{\rm self}\cdot\mathbf{u}$, to be equal to
$P$. However, with Eq. (133), this approach (say, for 1D motion) would give a very unnatural result,

$$F_{\rm self} \propto \frac{\dot{u}^2}{u}, \tag{10.135}$$

that might diverge at some points of the particle’s trajectory. This failure is clearly due to the retardation
effect: as the reader may recall, Eq. (133) results from the analysis of radiation fields at large distances
from the particle, e.g., from the second term in Eq. (20a), i.e. when the non-radiative first term (which is
much larger at small distances, $R \to 0$) is ignored.

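Before exploring the effects of this term, let us, however, make one more try with Eq. (133),
considering its average effect on some periodic motion of the particle. To calculate the average, let us
write

$$\dot{u}^2 \equiv \dot{\mathbf{u}}\cdot\dot{\mathbf{u}} = \frac{d}{dt}\left(\mathbf{u}\cdot\dot{\mathbf{u}}\right) - \mathbf{u}\cdot\ddot{\mathbf{u}}, \tag{10.136}$$

and integrate this identity, over the motion period $T$, by parts:

$$\overline{P} = \frac{1}{T}\int_0^T \frac{2}{3c^3}\,\frac{q^2}{4\pi\varepsilon_0}\,\dot{\mathbf{u}}\cdot\dot{\mathbf{u}}\,dt = \frac{2}{3c^3}\,\frac{q^2}{4\pi\varepsilon_0}\,\frac{1}{T}\left[\left.\mathbf{u}\cdot\dot{\mathbf{u}}\right|_0^T - \int_0^T \ddot{\mathbf{u}}\cdot\mathbf{u}\,dt\right] = -\frac{1}{T}\int_0^T \frac{2}{3c^3}\,\frac{q^2}{4\pi\varepsilon_0}\,\ddot{\mathbf{u}}\cdot\mathbf{u}\,dt. \tag{10.137}$$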


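On the other hand, the back-action force would give

$$\overline{P} = -\frac{1}{T}\int_0^T \mathbf{F}_{\rm self}\cdot\mathbf{u}\,dt. \tag{10.138}$$

These two averages coincide if 51

$$\mathbf{F}_{\rm self} = \frac{2}{3c^3}\,\frac{q^2}{4\pi\varepsilon_0}\,\ddot{\mathbf{u}}. \tag{10.139}$$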


This is the so-called Abraham-Lorentz force for self-action. Before going after a more serious
derivation of this formula, let us estimate its scale, presenting Eq. (139) as

$$\mathbf{F}_{\rm self} = m\tau\,\ddot{\mathbf{u}}, \quad \text{with } \tau \equiv \frac{2}{3mc^3}\,\frac{q^2}{4\pi\varepsilon_0}, \tag{10.140}$$

where constant $\tau$ evidently has the dimension of time. Recalling definition (8.41) of the classical radius
$r_c$ of the particle, Eq. (140) for $\tau$ may be rewritten as

$$\tau = \frac{2}{3}\,\frac{r_c}{c}. \tag{10.141}$$

For the electron, $\tau$ is of the order of $10^{-23}$ s. This means that in most cases the Abraham-Lorentz force
is either negligible or leads to the same results as the perturbative treatments of energy loss we have
used earlier in this chapter.
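The electron estimate is easy to reproduce. The sketch below evaluates $r_c = q^2/4\pi\varepsilon_0mc^2$ and $\tau = (2/3)r_c/c$ from rounded CODATA values of the constants; the function names are hypothetical, and the snippet is meant only as an order-of-magnitude check.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
M_E = 9.1093837015e-31      # electron rest mass, kg
C = 299792458.0             # speed of light, m/s


def classical_radius(q, m):
    """Classical radius r_c = q^2 / (4*pi*eps0*m*c^2), in meters."""
    return q**2 / (4.0 * math.pi * EPS0 * m * C**2)


def abraham_lorentz_time(q, m):
    """Time constant tau = (2/3) * r_c / c of the Abraham-Lorentz force, in seconds."""
    return 2.0 * classical_radius(q, m) / (3.0 * C)


r_c = classical_radius(E_CHARGE, M_E)
tau = abraham_lorentz_time(E_CHARGE, M_E)
print(r_c)  # ~2.8e-15 m
print(tau)  # ~6.3e-24 s, i.e. of the order of 1e-23 s
```

Since $\tau$ scales as $q^2/m$, for any heavier particle of the same charge (say, a proton) the time constant is smaller still.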

However, Eq. (140) brings some unpleasant surprises. For example, let us consider a 1D
oscillator of eigenfrequency $\omega_0$. For it, Eq. (134), with the back-action force given by Eq. (140), is

$$m\ddot{x} + m\omega_0^2x = m\tau\,\dddot{x}. \tag{10.142}$$

Looking for the solution to this linear differential equation in the usual exponential form, $x(t) \propto
\exp\{\lambda t\}$, we get the following characteristic equation,

$$\lambda^2 + \omega_0^2 = \tau\lambda^3. \tag{10.143}$$

It may look like that for any “reasonable” value of $\omega_0 \ll 1/\tau \sim 10^{23}\ \mathrm{s}^{-1}$, the right-hand side of this
nonlinear algebraic equation may be treated as a perturbation. Indeed, looking for its solutions in the
natural form $\lambda_\pm = \pm i\omega_0 + \lambda'$, with $|\lambda'| \ll \omega_0$, expanding both parts of Eq. (143) in the Taylor series in
small parameter $\lambda'$, and keeping only linear terms, we get

51 This formula may be readily generalized to the relativistic case:

$$F^\alpha_{\rm self} = \frac{2}{3mc^3}\,\frac{q^2}{4\pi\varepsilon_0}\left[\frac{d^2p^\alpha}{d\tau^2} + \frac{p^\alpha}{(mc)^2}\,\frac{dp_\beta}{d\tau}\frac{dp^\beta}{d\tau}\right]$$

- the so-called Abraham-Lorentz-Dirac force.






$$\lambda' \approx -\frac{\omega_0^2\tau}{2}. \tag{10.144}$$

This means that the energy of free oscillations decreases in time as $\exp\{2\lambda't\} = \exp\{-\omega_0^2\tau t\}$; this is
exactly the radiative damping analyzed earlier. However, Eq. (143) is deceiving; it has a third root
corresponding to unphysical, exponentially growing (so-called run-away) solutions. This is easiest to see
for a free particle, with $\omega_0 = 0$. Then Eq. (143) becomes very simple,

$$\lambda^2 = \tau\lambda^3, \tag{10.145}$$

and it is easy to find all its 3 roots explicitly: $\lambda_1 = \lambda_2 = 0$ and $\lambda_3 = 1/\tau$. While the first 2 roots correspond
to the values $\lambda_\pm$ found earlier, the last one describes exponential (and extremely fast!) acceleration.
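All three roots of the characteristic equation (143) may also be found numerically. The sketch below is one possible check, not part of the derivation: it rewrites the equation in the dimensionless variables $s \equiv \lambda\tau$ and $\epsilon \equiv \omega_0\tau$ (my notation), so that it reads $s^3 - s^2 - \epsilon^2 = 0$, and refines two starting guesses by plain Newton iterations; the function name and the sample value of $\epsilon$ are illustrative.

```python
def characteristic_root(eps, start, iterations=50):
    """Newton iteration for f(s) = s^3 - s^2 - eps^2 = 0, where s = lambda*tau and
    eps = omega_0*tau; 'start' is the (complex) initial guess, eps must be nonzero."""
    s = complex(start)
    for _ in range(iterations):
        f = s * s * s - s * s - eps * eps
        df = 3.0 * s * s - 2.0 * s
        s -= f / df
    return s


eps = 1.0e-3  # omega_0 * tau; tiny for any realistic oscillator
damped = characteristic_root(eps, 1j * eps)  # starts from the undamped value i*omega_0
runaway = characteristic_root(eps, 1.0)      # starts from the free-particle value 1/tau

print(damped)    # real part close to -eps**2/2: slow radiative damping
print(runaway)   # close to 1 + eps**2: growth on the time scale tau
```

The third root is the complex conjugate of `damped`. The damped pair reproduces the perturbative result for $\lambda'$, while the run-away root stays near $1/\tau$ however small $\omega_0$ is, which is why it cannot be removed by the perturbative argument.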

In order to remove this artifact, let us try to develop a self-consistent approach to back action,
taking into account the near-field terms of the particle’s fields. For that, we need to somehow overcome the
divergence of Eqs. (10) and (20) at $R \to 0$. The most reasonable way to do this is to spread the particle’s
charge over a ball of radius $a$, with a spherically-symmetric (but not necessarily constant) density $\rho(r)$,
and at the end of the calculations trace the limit $a \to 0$. 52 Again sticking to the non-relativistic case (so that
the magnetic component of the Lorentz force is not important), we should calculate

$$\mathbf{F}_{\rm self} = \int_V \rho(r)\,\mathbf{E}(\mathbf{r},t)\,d^3r, \tag{10.146}$$

where the electric field is that of the charge itself, with the field of any elementary charge $dq = \rho(r)d^3r$
described by Eqs. (20a).

In order to make analytical calculations doable, we need to make the assumption $a \ll r_c$, treat the ratio
$R/r_c \sim a/r_c$ as a small parameter, and expand the result in the Taylor series in small $R$. This procedure
yields

$$\mathbf{F}_{\rm self} = -\frac{2}{3}\,\frac{1}{4\pi\varepsilon_0}\sum_{n=0}^\infty \frac{(-1)^n}{c^{n+2}\,n!}\,\frac{d^{n+1}\mathbf{u}}{dt^{n+1}}\int_V d^3r\int_V d^3r'\,\rho(r)\,R^{n-1}\rho(r'). \tag{10.147}$$


Distance $R$ cancels only in the term with $n = 1$,

$$\mathbf{F}_1 = \frac{2}{3c^3}\,\frac{1}{4\pi\varepsilon_0}\,\frac{d^2\mathbf{u}}{dt^2}\int_V d^3r\int_V d^3r'\,\rho(r)\rho(r') = \frac{q^2}{6\pi\varepsilon_0c^3}\,\ddot{\mathbf{u}}, \tag{10.148}$$

showing that we have recovered (now in an apparently legitimate fashion) Eq. (139) for the Abraham-Lorentz
force. One could argue that in the limit $a \to 0$ the terms higher in $R \sim a$ (with $n > 1$) could be
ignored. However, we have to notice that the main contribution to series (147) is not described by
Eq. (148) for $n = 1$, but is given by the larger term with $n = 0$:

$$\mathbf{F}_0 = -\frac{2}{3}\,\frac{1}{4\pi\varepsilon_0}\,\frac{\dot{\mathbf{u}}}{c^2}\int_V d^3r\int_V d^3r'\,\frac{\rho(r)\rho(r')}{R} \equiv -\frac{4}{3c^2}\,\dot{\mathbf{u}}\,\frac{1}{8\pi\varepsilon_0}\int_V d^3r\int_V d^3r'\,\frac{\rho(r)\rho(r')}{R} = -\frac{4}{3c^2}\,\dot{\mathbf{u}}\,U, \tag{10.149}$$

where $U$ is the electrostatic energy of the charge distribution.


52 Note: this operation cannot be interpreted as describing a quantum spread due to the finite extent of the point
particle’s wavefunction. In quantum mechanics, parts of the wavefunction of the same charged particle do not interact
with each other!

This term may be interpreted as the inertial “force” $-m_{\rm ef}\dot{\mathbf{u}}$ 53 with the effective electromagnetic mass

$$m_{\rm ef} = \frac{4}{3}\,\frac{U}{c^2}. \tag{10.150}$$

This is the famous (or rather infamous :-) 4/3 problem: the result differs by the factor 4/3 from the value
$U/c^2$ expected from the mass-energy relation, and hence does not allow one to interpret the
electron’s mass as that of its electric field. The (admittedly, rather formal) resolution of this paradox is
possible only in quantum electrodynamics with its renormalization techniques - beyond the framework
of this course. Note that these issues are only important for motions with frequencies of the order of $1/\tau
\sim 10^{23}\ \mathrm{s}^{-1}$, i.e. at energies $\mathscr{E} \sim \hbar/\tau \sim 10^{-11}$ J $\sim 10^8$ eV, while other quantum electrodynamics effects may
be observed at much lower frequencies, starting from $\sim 10^{10}\ \mathrm{s}^{-1}$. Hence the 4/3 problem is by no means
the only motivation for the transfer from classical to quantum electrodynamics.
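To get a feeling for the numbers behind the 4/3 problem, one may pick a particular charge distribution. The sketch below assumes (purely for illustration; the derivation above does not require it) a thin, uniformly charged spherical shell of radius $a$, whose electrostatic energy is $U = q^2/8\pi\varepsilon_0a$, and compares the resulting electromagnetic mass with the electron’s rest mass. All function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
M_E = 9.1093837015e-31      # electron rest mass, kg
C = 299792458.0             # speed of light, m/s


def classical_radius(q, m):
    """Classical radius r_c = q^2 / (4*pi*eps0*m*c^2), in meters."""
    return q**2 / (4.0 * math.pi * EPS0 * m * C**2)


def shell_self_energy(q, a):
    """Electrostatic energy of a thin, uniformly charged shell (assumed model), in joules."""
    return q**2 / (8.0 * math.pi * EPS0 * a)


def electromagnetic_mass(energy):
    """Effective electromagnetic mass m_ef = (4/3) * U / c^2, in kilograms."""
    return 4.0 * energy / (3.0 * C**2)


r_c = classical_radius(E_CHARGE, M_E)
u_shell = shell_self_energy(E_CHARGE, r_c)
print(u_shell / (M_E * C**2))                # 1/2: U is half the rest energy at a = r_c
print(electromagnetic_mass(u_shell) / M_E)   # 2/3, rather than the 1/2 given by U/c^2
print(electromagnetic_mass(shell_self_energy(E_CHARGE, 2.0 * r_c / 3.0)) / M_E)  # 1
```

In this model the whole rest mass would be "electromagnetic" for a shell of radius $(2/3)r_c$, while the naive relation $m = U/c^2$ would require $a = r_c/2$; the mismatch between the two is exactly the factor 4/3.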

However, the reader should not think that his or her time spent on this course has been lost: 
quantum electrodynamics incorporates virtually all classical electrodynamics results, and transition 
between them is surprisingly straightforward. 54 


10.7. Exercise problems

10.1 . A point charge q that had been in a stationary position on a circle of 
radius R, is carried over, along the circle, to the opposite position on the same 
diameter (see Fig. on the right) as fast as only physically possible, and then is kept 
steady at this new position. Calculate and sketch the time dependence of the 
electric field E at the center of the circle. 

10.2 . Express the total radiation power by a relativistic particle with the electric charge q and the 
rest mass m, moving with velocity u, via the external Lorentz force F exerted on the particle. 

10.3 . A relativistic particle with electric charge q, initially at rest, is accelerated by a constant 
force F until it reaches certain velocity u, and then moves by inertia. Calculate the total energy radiated 
during the acceleration. 

10.4.* Calculate the power spectrum of the radiation emitted by a relativistic particle with charge
$q$, performing 1D harmonic oscillations with frequency $\omega$ and displacement amplitude $a$.

10.5. Analyze the polarization and the spectral contents of the synchrotron radiation
propagating in the direction perpendicular to the particle’s rotation plane. How do the results change if not one,
but $N > 1$ similar particles move around the circle, at equal angular distances?

10.6. Calculate the time dependence of the kinetic energy of a charged relativistic particle
performing synchrotron motion in a constant and uniform magnetic field $\mathbf{B}$, and hence emitting the
synchrotron radiation. Sketch the particle’s trajectory.

Hint: You may assume that the energy loss is relatively slow ($-d\mathscr{E}/dt \ll \omega_c\mathscr{E}$), but should spell
out the condition of validity of this assumption.

53 See, e.g., CM Sec. 6.6.

54 See, e.g., QM Chapter 9 and references therein.

10.7 . Find the polarization of the synchrotron radiation propagating within particle’s rotation 

plane. 


10.8. The basic quantum theory of radiation shows 55 that the electric dipole radiation by a
particle is allowed only if its angular momentum change at the transition equals $\pm\hbar$.

(i) Estimate the change $\Delta L$ of the orbital momentum of an ultrarelativistic particle due to its
emission of a single photon of the synchrotron radiation.

(ii) Does the quantum mechanics forbid such radiation? If not, why? 

10.9. A relativistic particle moves along axis $z$, with velocity $u_z$, through an undulator - a system
of permanent magnets providing (in the simplest model) a perpendicular magnetic field, whose
distribution near axis $z$ is sinusoidal: 56

$$\mathbf{B} = \mathbf{n}_yB_0\cos k_0z.$$

Assuming that the field is so weak that it causes only relatively small deviations of particle’s trajectory 
from the straight line, calculate the angular distribution of the resulting radiation. What condition does 
this assumption impose on system’s parameters? 

10.10 . Discuss possible effects of the interference of the undulator radiation from different 
periods of its static field distribution, in particular, calculate the angular positions of maxima of the 
radiation power density. 

10.11. An electron, launched directly toward a plane surface of a perfect conductor, is instantly
absorbed by it at the collision. Find the angular distribution and frequency spectrum of the 
electromagnetic waves radiated at this collision, if the initial kinetic energy T of the particle is much 
larger than conductor’s workfunction (/). Give a semi-quantitative discussion of the limitations of your 
result. 


10.12 . A relativistic particle, with the rest mass m and electric charge q, flies with the velocity u 
by an immobile point charge q ’, with the impact parameter b so large that the deviations of its trajectory 
from the straight line are negligible. Calculate the total energy loss due to the electromagnetic radiation 
during the passage. Formulate the conditions of validity of your result. 


55 See, e.g., EM Sec. 9.3, in particular Eq. (9.53) and its discussion. 

56 As the Maxwell equation for $\nabla\times\mathbf{H}$ shows, such field distribution cannot be created in any nonvanishing volume
of free space. However, it may be created on a line - e.g., on the particle’s straight trajectory.